"NUCLEIC ACID SEQUENCE AND METHOD FOR SELECTIVELY
EXPRESSING A PROTEIN IN A TARGET CELL OR TISSUE"

FIELD OF THE INVENTION

THIS INVENTION relates generally to gene
therapy. More particularly, the present invention
relates to a synthetic nucleic acid sequence and to a
method for selectively expressing a protein in a
target cell or tissue in which at least one existing
codon of a parent nucleic acid sequence encoding the
protein has been replaced with a synonymous codon.
The invention also relates to production of virus
particles using one or more synthetic nucleic acid
sequences and the method according to the invention.



BACKGROUND OF THE INVENTION

While gene therapy is of great clinical
interest for treatment of gene defects, this therapy
has not entered into mainstream clinical practice
because selective delivery of genes to target tissues
has proven extremely difficult. Currently, viral
vectors are used, particularly retroviruses and
adenoviruses, which are to some extent selective.
However, many vector systems are by their nature
unable to produce stable integrants and some also
invoke immune responses, thereby preventing effective
treatment. Alternatively, "naked" DNA is packaged in
liposomes or other similar delivery systems. A major
problem to be overcome is that such gene delivery
systems themselves are not tissue selective, whereas
selective targeting of genes to particular tissues
would be desirable for many disorders (e.g., cancer
therapy). While use of tissue specific promoters to
target gene therapy has been effective in some animal
models, it has proven less so in man, and selective
tissue specific promoters are not available for a wide
range of tissues.

The current invention has arisen
unexpectedly from recent investigations exploring why
papillomavirus (PV) late gene expression is restricted
to differentiated keratinocytes. In this regard, it
is known that PV late genes L1 and L2 are only
expressed in non-dividing differentiated keratinocytes
(KCs). Many investigators including the present
inventors have been unable to detect significant PV L1
and L2 protein expression when these genes are
transduced or transfected into undifferentiated
cultured cells, using a range of conventional
constitutive viral promoters including retroviral long
terminal repeats (LTRs) and the strong constitutive
promoters of CMV and SV40.

PV L1 mRNA can however be efficiently
translated in vitro using rabbit reticulocyte cell
lysate, suggesting that there are no cellular
inhibitors in the lysate interfering with translation
of L1. The major difference between the in vitro and
in vivo translation systems is that L1 mRNA is the
dominant mRNA in in vitro translation reactions,
while it constitutes a minor fraction among the
cellular mRNAs in intact cells.



WO 99/02694 PCT/AU98/00530 


In vivo, PV late proteins are not produced
in undifferentiated KCs. However, they are expressed
in large quantity in highly differentiated KCs. The
mechanism of this tight control of late gene
expression has been poorly understood, and searches by
many groups for KC specific PV gene transcriptional
control proteins have been unrewarding.

Blockage to translation of L1 mRNA in vivo
has been attributed to sequences within the L1 ORF
(Tan et al. 1995, J. Virol. 69 5607-5620; Tan and
Schwartz, 1995, J. Virol. 69 2932-2945). By using
Rev and the Rev-responsive element of HIV, such inhibition
could be overcome (Tan et al. 1995, supra).
Accordingly, the inventors examined whether removal of
putative "inhibitory sequences" in the L1 ORF would
allow production of L1 protein in undifferentiated
cells. Deletion mutagenesis of BPV L1 to remove
putative inhibitory sequences and expression of
resultant deletion mutants in CV-1 cells revealed
surprisingly that despite expression of L1 mRNA, L1
protein could not be detected.

In view of the foregoing, it has been
difficult hitherto to understand how papillomaviruses
produce large amounts of L1 protein in the late stage
of their life cycle using this apparently
"untranslatable" gene.

Surprisingly, however, it has now been
discovered that PV L1 protein can be produced at
substantially enhanced levels in an undifferentiated
host cell by replacing existing codons of a native L1
gene with synonymous codons used at relatively high
frequency by genes of the undifferentiated host cell
compared to the existing codons. It has also been
found unexpectedly that there are substantial
differences in the relative abundance of particular
isoaccepting transfer RNAs (tRNAs) in different cells
or tissues and that this plays a pivotal role in protein
expression from a gene with a given codon usage or
composition. This discovery has been reduced to
practice in synthetic nucleic acid sequences and
generic methods, which utilize codon alteration as a
means for targeting expression of a protein to
particular cells or tissues or, alternatively, to cells
in a specific state of differentiation.

OBJECT OF THE INVENTION

It is therefore an object of the present 
invention to provide a synthetic nucleic acid sequence 
and a method for selectively expressing a protein in a 
target cell or tissue which sequence and method 
ameliorate at least some of the disadvantages 
associated with the prior art. 

SUMMARY OF THE INVENTION

Accordingly, in one aspect of the
invention, there is provided a synthetic nucleic acid
sequence capable of selectively expressing a protein
in a target cell or tissue of a mammal, wherein said
selective expression is effected by replacing at least
one existing codon of a parent nucleic acid sequence
with a synonymous codon to form said synthetic nucleic
acid sequence.



Suitably, said synonymous codon corresponds
to an iso-tRNA which, when compared to an iso-tRNA
corresponding to the at least one existing codon, is
in higher abundance in the target cell or tissue
relative to one or more other cells or tissues of the
mammal.

Preferably, said synonymous codon
corresponds to an iso-tRNA which, when compared to an
iso-tRNA corresponding to the at least one existing
codon, is in higher abundance in the target cell or
tissue relative to a precursor cell or tissue.

Alternatively, said synonymous codon
corresponds to an iso-tRNA which, when compared to an
iso-tRNA corresponding to the at least one existing
codon, is in higher abundance in the target cell or
tissue relative to a cell or tissue derived therefrom.

Advantageously, said corresponding iso-tRNA
in said target cell or tissue is at a level which is
at least 110%, preferably at least 200%, more
preferably at least 500%, and most preferably at least
1000%, of that expressed in the or each other cell or
tissue of the mammal.

Alternatively, the synonymous codon may be
selected from the group consisting of (1) a codon used
at relatively high frequency by genes, preferably
highly expressed genes, of the target cell or tissue,
(2) a codon used at relatively high frequency by
genes, preferably highly expressed genes, of the or
each other cell or tissue, (3) a codon used at
relatively high frequency by genes, preferably highly
expressed genes, of the mammal, (4) a codon used at
relatively low frequency by genes of the target cell
or tissue, (5) a codon used at relatively low
frequency by genes of the or each other cell or
tissue, (6) a codon used at relatively low frequency
by genes of the mammal, (7) a codon used at relatively
high frequency by genes of another organism, and (8) a
codon used at relatively low frequency by genes of
another organism.

In a preferred embodiment, the at least one
existing codon and the synonymous codon are
selected such that said protein is expressed from said
synthetic nucleic acid sequence in said target cell or
tissue at a level which is at least 110%, preferably
at least 200%, more preferably at least 500%, and most
preferably at least 1000%, of that expressed from said
parent nucleic acid sequence in said target cell or
tissue.

In another aspect, the invention resides in
a method for selectively expressing a protein in a
target cell or tissue of a mammal, wherein said
selective expression is effected by replacing at least
one existing codon of a parent nucleic acid sequence
with a synonymous codon to form said synthetic nucleic
acid sequence.

Preferably, the method is further
characterized by the steps of:

(a) replacing at least one existing codon
of a parent nucleic acid sequence encoding said
protein with a synonymous codon to produce a synthetic
nucleic acid sequence having altered translational
kinetics compared to said parent nucleic acid sequence
such that said protein is selectively expressible in
said target cell or tissue;

(b) administering to the mammal and
introducing into said target cell or tissue, or a
precursor cell or precursor tissue thereof, said
synthetic nucleic acid sequence operably linked to one
or more regulatory nucleotide sequences; and

(c) selectively expressing said protein
in said target cell or tissue.

Preferably, the method further includes,
prior to step (a):

(i) measuring relative abundance of
different isoacceptor transfer RNAs in said target
cell or tissue, and in one or more other cells or
tissues of the mammal; and

(ii) identifying said at least one
existing codon and said synonymous codon based on said
measurement, wherein said synonymous codon corresponds
to an iso-tRNA which, when compared to an iso-tRNA
corresponding to the existing codon, is in higher
abundance in said target cell or tissue relative to
the or each other cell or tissue of the mammal.

Suitably, step (ii) above is further 
characterized in that said synonymous codon 
corresponds to an iso-tRNA which, when compared to an 
iso-tRNA corresponding to the at least one existing 
codon, is in higher abundance in the target cell or 
tissue relative to a precursor cell or tissue. 

Alternatively, step (ii) above is further 
characterized in that said synonymous codon 
corresponds to an iso-tRNA which, when compared to an 
iso-tRNA corresponding to the at least one existing 
codon, is in higher abundance in the target cell or 
tissue relative to a cell or tissue derived therefrom. 

Alternatively, the method further includes,
prior to step (a), identifying said at least one
existing codon and said synonymous codon based on
respective relative frequencies of particular codons
used by genes selected from the group consisting of
(I) genes of the target cell or tissue, (II) genes of
the or each other cell or tissue, (III) genes of the
mammal, and (IV) genes of another organism.

In yet another aspect, the invention
provides a method for expressing a protein in a target
cell or tissue from a first nucleic acid sequence
including the steps of:

introducing into said target cell or
tissue, or a precursor cell or precursor tissue
thereof, a second nucleic acid sequence encoding at
least one isoaccepting transfer RNA wherein said
second nucleic acid sequence is operably linked to one
or more regulatory nucleotide sequences, and wherein
said at least one isoaccepting transfer RNA is
normally in relatively low abundance in said target
cell or tissue and corresponds to a codon of said
first nucleic acid sequence.

In a further aspect, the invention extends
to a method for producing a virus particle in a
cycling eukaryotic cell, said virus particle
comprising at least one protein necessary for assembly
of said virus particle, wherein said at least one
protein is not expressed in said cell from a parent
nucleic acid sequence at a level sufficient to permit
virus assembly therein, said method including the
steps of:

(a) replacing at least one existing codon
of said parent nucleic acid sequence with a synonymous
codon to produce a synthetic nucleic acid sequence
having altered translational kinetics compared to said
parent nucleic acid sequence such that said at least
one protein is expressible from said synthetic nucleic
acid sequence in said cell at a level sufficient to
permit virus assembly therein;

(b) introducing into said cell or a
precursor thereof said synthetic nucleic acid sequence
operably linked to one or more regulatory nucleotide
sequences; and

(c) expressing said at least one protein
in said cell in the presence of other viral proteins
required for assembly of said virus particle to
thereby produce said virus particle.

In yet a further aspect of the invention,
there is provided a method for producing a virus
particle in a cycling cell, said virus particle
comprising at least one protein necessary for assembly
of said virus particle, wherein said at least one
protein is not expressed in said cell from a parent
nucleic acid sequence at a level sufficient to permit
virus assembly therein, and wherein at least one
existing codon of said parent nucleic acid sequence is
rate limiting for the production of said at least one
protein to said level, said method including the step
of introducing into said cell a nucleic acid sequence
capable of expressing therein an isoaccepting transfer
RNA specific for said at least one codon.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1A depicts the nucleotide sequence
(SEQ ID NO: 1) and deduced amino acid sequence (SEQ ID
NO: 2) of BPV1 L1. Amino acids (in single letter code)
are presented below the second nucleotide of each
codon. Mutations introduced into the genes are
indicated above the corresponding nucleotides of the
original sequence. Horizontal lines indicate the
sites and enzymes used for cloning. This replacement
of nucleotides resulted in a nucleic acid sequence
encoding BPV1 L1 polypeptide with an amino acid
sequence identical to the wild type, but having
synonymous codons that are frequently used by
mammalian genes.

Figure 1B shows the nucleotide sequence
(SEQ ID NO: 5) and deduced amino acid sequence (SEQ ID
NO: 6) relating to BPV1 L2 ORF. Amino acids (in single
letter code) are presented below the second nucleotide
of each codon. Mutations introduced into the genes
are indicated above the corresponding nucleotides of
the original sequence. Horizontal lines indicate the
sites and enzymes used for cloning. This replacement
of nucleotides resulted in a nucleic acid sequence
encoding BPV1 L2 polypeptide with an amino acid
sequence identical to the wild type, but having
synonymous codons that are frequently used by
mammalian genes.

Figure 1C depicts the nucleotide sequence
(SEQ ID NO: 9) and deduced amino acid sequence (SEQ ID
NO: 10) of green fluorescent protein (GFP). Amino
acids (in single letter code) are presented below the
SUBSTITUTE SHEET (RULE 26) 
second nucleotide of each codon. Mutations introduced
into the genes are indicated above the corresponding
nucleotides of the original sequence. Horizontal
lines indicate the sites and enzymes used for cloning.
This replacement of nucleotides resulted in a nucleic
acid sequence encoding GFP polypeptide with an amino
acid sequence identical to the native sequence
modified for optimal expression in eukaryotic cells,
but having synonymous codons that are frequently used
by papillomavirus genes.

Figure 2A shows detection of L1 protein
expressed from synthetic and wild type BPV1 L1 genes.
Cos-1 cells were transfected with a synthetic L1
expression plasmid pCDNA/HBL1, and a wild type L1
expression plasmid pCDNA/BPVL1wt. The expression of
L1 was detected by immunofluorescent staining. Cells
were fixed after 36 hrs and incubated with rabbit
anti-BPV1 L1 antiserum, followed by FITC-conjugated
goat anti-rabbit IgG antibody.

Figure 2B shows detection by Western blot
of L1 protein from Cos-1 cells transfected with
pCDNA/HBL1 and pCDNA/BPVL1wt.

Figure 2C shows a Northern blot in which L1
mRNA extracted from transfected cells was probed with
32P-labeled probes produced from wild type L1 sequence.
The amount of mRNA loaded in respective lanes was
examined by hybridization of the mRNA sample with a
gapdh probe.

Figure 3A shows detection of L2 protein
expressed from synthetic and wild type BPV1 L2 genes.
Cos-1 cells were transfected with a synthetic L2
expression plasmid pCDNA/HBL2, and a wild type L2
expression plasmid pCDNA/BPVL2wt. The expression of
L2 was detected by immunofluorescent staining. Cells
were fixed after 36 hrs and incubated with rabbit
anti-BPV1 L2 antiserum, followed by FITC-conjugated
goat anti-rabbit IgG antibody.

Figure 3B shows detection by Western blot
of L2 protein from Cos-1 cells transfected with
pCDNA/HBL2 and pCDNA/BPVL2wt.

Figure 3C shows a Northern blot in which L2
mRNA extracted from transfected cells was probed with
32P-labeled probes produced from wild type L2 sequence.
The amount of mRNA loaded in respective lanes was
examined by hybridization of the mRNA sample with a
gapdh probe.

Figure 4 shows in vitro translation of
BPVL1 sequences, wild type BPVL1 (wt) or synthetic L1
(HB), using rabbit reticulocyte lysate or wheat germ
extract in the presence of 35S-methionine. In the top
panel, wt L1 or HB L1 plasmid DNA was added to the T7
DNA polymerase-coupled in vitro translation system.
L1 protein was detected by Western blot analysis. In
the bottom panel, the translation efficiency of wt L1
or HB L1 sequences in the presence or absence of tRNA
was compared. Translation was carried out in rabbit
reticulocyte lysate (rabbit) or wheat germ extract
(wheat), and samples were collected every two minutes
starting from minute 8. Left side of lower panel
indicates if 10^-5 M bovine liver or yeast tRNA was
supplied.

Figure 5A is a schematic representation of
plasmids used to determine L2 expression from BPV
cryptic promoter(s). The wild type L1 sequence and
most of the wild type L2 sequence were deleted from
the BPV1 genome by BamHI and HindIII digestion and the
remaining BPV1 sequence (in yellow) was cloned into
pUC18. Wild type or synthetic humanized L2 sequences
(in red) were inserted into the BamHI site of the BPV1
genome. The position of the inserted SV40 ori
sequence (in white) is indicated. A plasmid carrying
the modified L2 sequence but lacking the SV40 ori
sequence was also used as a control. The plasmids
were transfected into Cos-1 cells and the expression
of L2 protein was determined using BPV1 L2-specific
polyclonal antiserum followed by FITC-linked
anti-rabbit IgG.

Figure 5B shows expression of L2 protein
from native papillomavirus promoter. The plasmids
shown in Figure 5A were used to transfect Cos-1 cells
and the expression of L2 protein was determined using
BPV1 L2-specific polyclonal antiserum followed by
FITC-linked anti-rabbit IgG. A mock transfection in
which the cells did not receive plasmid was used as
control.

Figure 6 shows expression of GFP in Cos-1
cells transfected with wild-type gfp (wt) or a
synthetic gfp gene carrying codons used at relatively
high frequency by papillomavirus genes (p). The mRNA
extracted from cells transfected with gfp or P gfp was
probed with 32P-labeled gfp probe and is shown on the
right panel, using gapdh as a reference gene.

Figure 7 shows the expression pattern of
GFP in vivo from wild-type gfp gene, or a synthetic
gfp gene carrying codons used at relatively high
frequency by papillomavirus genes. Using a gene gun,
mice were shot with PGFP (left panel) and GFP (right
panel) expression plasmids encoding GFP protein. A
transverse section of the mouse skin shows
where the gfp gene is expressed. Bright-field
photographs of the same section, in which dermis (D)
and epidermis (E) are highlighted, are shown to identify
the location of fluorescence in the epidermis. Arrows
indicate fluorescent signals.

DETAILED DESCRIPTION

The present invention arises from the
unexpected discovery that the relative abundance of
different isoaccepting transfer RNAs varies in
different cells or tissues, or alternatively in cells
or tissues in different states of differentiation or
in different stages of the cell cycle, and that such
differences may be exploited together with codon
composition of a gene to regulate and direct
expression of a protein to a particular cell or
tissue, or alternatively to a cell or tissue in a
specific state of differentiation or in a specific
stage of the cell cycle. According to the present
invention, this selective targeting is effected by
replacing at least one existing codon of a parent
nucleic acid sequence encoding the protein with a
synonymous codon.

Replacement of existing codons with
synonymous codons is not new per se. In this regard, we
refer to International Application Publication No WO
96/09378 which utilizes such substitution to provide a
method of expressing proteins of eukaryotic and viral
origin at high levels in in vitro mammalian cell
culture systems, the main thrust of the method being
the harvesting of such proteins. In distinct
contrast, the present invention utilizes substitution
of one or more codons in a gene for targeting
expression of the gene to particular cells or tissues
with the ultimate aim of facilitating gene therapy as
described herein.

The term "synonymous codon" as used herein
refers to a codon having a different nucleotide
sequence to an existing codon but encoding the same
amino acid as the existing codon.
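
By way of illustration only, the definition above can be sketched in Python. The sketch assumes the standard genetic code and codons written in the DNA alphabet (T rather than U); the names CODON_TABLE and synonymous_codons are hypothetical and form no part of the specification.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each position.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(bases): aa
    for bases, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon: str) -> list[str]:
    """Return the other codons that encode the same amino acid as `codon`."""
    codon = codon.upper().replace("U", "T")
    amino_acid = CODON_TABLE[codon]
    return sorted(
        other
        for other, aa in CODON_TABLE.items()
        if aa == amino_acid and other != codon
    )
```

For example, synonymous_codons("GCA") lists the three other alanine codons, whereas a codon such as ATG (Met) has no synonymous codon and so offers no candidate for replacement.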

By "isoaccepting transfer RNA" is meant one
or more transfer RNA molecules that differ in their
anticodon structure but are specific for the same
amino acid.

Throughout this specification, unless the
context requires otherwise, the words "comprise",
"comprises" and "comprising" will be understood to
imply the inclusion of a stated integer or group of
integers but not the exclusion of any other integer or
group of integers.

Selection of synonymous codons

Determination of relative abundance of
different tRNA species in different cells

Advantageously, the synonymous codon
corresponds to an isoaccepting tRNA (iso-tRNA) which, when
compared to an iso-tRNA corresponding to the at least
one existing codon, is in higher abundance in the
target cell or tissue relative to one or more other
cells or tissues of the mammal.

Any method for determining the relative
abundance of an iso-tRNA in two or more cells or
tissues may be employed. For example, such method may
include isolating two or more particular cells or
tissues from a mammal, preparing an RNA extract from
each cell or tissue which extract includes tRNA, and
probing each extract respectively with different
nucleic acid sequences each being specific for a
particular iso-tRNA to determine the relative
abundance of an iso-tRNA between the two or more cells
or tissues.

Suitable methods for isolating particular
cells or tissues are well known to those of skill in
the art. For example, one can take advantage of one
or more particular characteristics of a cell or tissue
to specifically isolate the cell or tissue from a
heterogeneous population. Such characteristics
include, but are not limited to, anatomical location
of a tissue, cell density, cell size, cell morphology,
cellular metabolic activity, cell uptake of ions such
as Ca2+, K+, and H+ ions, cell uptake of compounds such
as stains, markers expressed on the cell surface,
cytokine expression, protein fluorescence, and
membrane potential. Suitable methods that may be used
in this regard include surgical removal of tissue,
flow cytometry techniques such as fluorescence-activated
cell sorting (FACS), immunoaffinity
separation (e.g., magnetic bead separation such as
Dynabead™ separation), density separation (e.g.,
metrizamide, Percoll™, or Ficoll™ gradient
centrifugation), and cell-type specific density
separation (e.g., Lymphoprep™). For example, dividing
cells or blast cells may be separated from non-dividing
cells or resting cells according to cell size
by FACS or metrizamide gradient separation.

Any suitable method for isolating total RNA
from a cell or tissue may be used. Typical procedures
contemplated by the invention are described in CURRENT
PROTOCOLS IN MOLECULAR BIOLOGY (Ausubel, et al., eds)
(John Wiley & Sons, Inc. 1997), hereby incorporated by
reference, at page 4.2.1 through page 4.2.7.
Preferably, techniques which favor isolation of tRNA
are employed as, for example, described in
Brunngraber, E.F. (1962, Biochem. Biophys. Res.
Commun. 8:1-3) which is hereby incorporated by
reference.

The probing of an RNA extract is suitably
effected with different oligonucleotide sequences each
being specific for a particular iso-tRNA. Of course
it will be appreciated that for a given mammal,
oligonucleotide sequences would need to be selected
which hybridize specifically with particular iso-tRNA
sequences expressed by the mammal. Such selection is
well within the realm of one of ordinary skill in the
art based on a known iso-tRNA sequence. For example, in
the case of a mouse, exemplary oligonucleotide
sequences which may be used include those described in
Gauss and Sprinzel (1983, Nucleic Acids Res. 11 (1))
hereby incorporated by reference. In this respect,
the oligonucleotide sequences may be selected from the
group consisting of:

5'-TAAGGACTGTAAGACTT-3' (SEQ ID NO:13) for Ala GCA
5'-CGAGCCAGCCAGGAGTC-3' (SEQ ID NO:14) for Arg CGA
5'-CTAGATTGGCAGGAATT-3' (SEQ ID NO:15) for Asn
5'-TAAGATATATAGATTAT-3' (SEQ ID NO:16) for Asp
5'-AAGTCTTAGTAGAGATT-3' (SEQ ID NO:17) for Cys
5'-TATTTCTACACAGCATT-3' (SEQ ID NO:18) for Glu
5'-CTAGGACAATAGGAATT-3' (SEQ ID NO:19) for Gln
5'-TACTCTCTTCTGGGTTT-3' (SEQ ID NO:20) for Gly
5'-TGCCGTGACTCGGATTC-3' (SEQ ID NO:21) for His
5'-TAGAAATAAGAGGGCTT-3' (SEQ ID NO:22) for Ile ATC
5'-TACTTTTATTTGGATTT-3' (SEQ ID NO:23) for Leu CTA
5'-TATTAGGGAGAGGATTT-3' (SEQ ID NO:24) for Leu CTT
5'-TCACTATGGAGATTTTA-3' (SEQ ID NO:25) for Lys
5'-CGCCCAACGTGGGGCTC-3' (SEQ ID NO:26) for Lys
5'-TAGTACGGGAAGGATTT-3' (SEQ ID NO:27) for Met-elong
5'-TGTTTATGGGATACAAT-3' (SEQ ID NO:28) for Phe
5'-TCAAGAAGAAGGAGCTA-3' (SEQ ID NO:29) for Pro CCA
5'-GGGCTCGTCCGGGATTT-3' (SEQ ID NO:30) for Pro CCI
5'-ATAAGAAAGGAAGATCG-3' (SEQ ID NO:31) for Ser AGC
5'-TGTCTTGAGAAGAGAAG-3' (SEQ ID NO:32) for Thr ACA
5'-TGGTAAAAAGAGGATTT-3' (SEQ ID NO:33) for Tyr TAC
5'-TCAGAGTGTTCATTGGT-3' (SEQ ID NO:34) for Val

Typically, the relative abundance of
iso-tRNA species may be determined by blotting techniques
that include a step whereby sample RNA or tRNA extract
is immobilized on a matrix (preferably a synthetic
membrane such as nitrocellulose), a hybridization
step, and a detection step. Northern blotting may be
used to identify an RNA sequence that is complementary
to a nucleic acid probe. Alternatively, dot blotting
and slot blotting can be used to identify
complementary DNA/RNA or RNA/RNA nucleic acid
sequences. Such techniques are well known by those
skilled in the art, and have been described in
Ausubel, et al (supra) at pages 2.9.1 through 2.9.20.

According to such methods, a sample of tRNA
immobilized on a matrix is hybridized under stringent
conditions to a complementary nucleotide sequence
(such as those mentioned above) which is labeled, for
example, radioactively, enzymatically or
fluorochromatically.

"Stringency" as used herein, refers to the 
temperature and ionic strength conditions, and 
presence or absence of certain organic solvents, 
during hybridization. The higher the stringency, the 
higher will be the degree of complementarity between 
the immobilized nucleotide sequences (i.e., iso-tRNA) 
and the labeled oligonucleotide sequence. For a 
discussion of typical stringent conditions that may be 
used, see CURRENT PROTOCOLS IN MOLECULAR BIOLOGY supra 
at pages 2.10.1 to 2.10.16, and Sambrook et al in 
MOLECULAR CLONING. A LABORATORY MANUAL (Cold Spring 
Harbor Press, 1989), hereby incorporated by reference, 
at sections 1.101 to 1.104. 

While stringent washes are typically
carried out at temperatures from about 42°C to 68°C,
one skilled in the art will appreciate that other
temperatures may be suitable for stringent conditions.
Maximum hybridization typically occurs at about 20° to
25° below the Tm for formation of a DNA-DNA hybrid. It
is well known in the art that the Tm is the melting
temperature, or temperature at which two complementary
nucleic acid sequences dissociate. Methods for
estimating Tm are well known in the art (see CURRENT
PROTOCOLS IN MOLECULAR BIOLOGY supra at page 2.10.8).
Maximum hybridization typically occurs at about 10° to
15° below the Tm for a DNA-RNA hybrid.
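
The temperature offsets stated above may be applied as in the following Python sketch, given purely as an illustration. The Tm estimate used here is the Wallace rule of thumb for short oligonucleotides (2°C per A or T, 4°C per G or C), which is an assumption substituted for the estimation methods cited above and is not taken from this specification; the function names are likewise hypothetical.

```python
# Offsets below Tm (in degrees C) at which maximum hybridization
# typically occurs, as stated in the text for each hybrid type.
OFFSETS_BELOW_TM = {
    "DNA-DNA": (20, 25),
    "DNA-RNA": (10, 15),
}


def wallace_tm(oligo: str) -> int:
    """Rough Tm estimate for a short oligonucleotide (Wallace rule).

    Assumption: 2 degrees per A/T and 4 degrees per G/C. This is a
    rule of thumb, not the method referenced in the specification.
    """
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def hybridization_window(tm: float, hybrid: str) -> tuple[float, float]:
    """Return (lowest, highest) temperature of the window below Tm."""
    smaller_offset, larger_offset = OFFSETS_BELOW_TM[hybrid]
    return (tm - larger_offset, tm - smaller_offset)
```

For a 17-mer DNA probe hybridized to immobilized tRNA, the "DNA-RNA" entry would apply; in practice the window so obtained would be a starting point for empirical optimization rather than a fixed condition.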

Other stringent conditions are well known
in the art. A skilled addressee will recognize that
various factors can be manipulated to optimize the
specificity of the hybridization. Optimization of the
stringency of the final washes can serve to ensure a
high degree of hybridization.

Methods for detecting labeled nucleotide
sequences hybridized to an immobilized nucleotide
sequence are well known to practitioners in the art.
Such methods include autoradiography and
chemiluminescent, fluorescent and colorimetric
detection.

Advantageously, the relative abundance of
an iso-tRNA in two or more cells or tissues may be
determined by comparing the respective levels of
binding of a labeled nucleotide sequence specific for
the iso-tRNA to equivalent amounts of immobilized RNA
obtained from the two or more cells or tissues.
Similar comparisons are suitably carried out to
determine the respective relative abundance of other
iso-tRNAs in the two or more cells or tissues. One of
ordinary skill in the art will thereby be able to
determine a relative tRNA abundance table (see for
example TABLE 2) for different cells or tissues. From
such comparisons, one or more synonymous codons may be
selected such that the or each synonymous codon
corresponds to an iso-tRNA which, when compared to an
iso-tRNA corresponding to an existing codon of the
parent nucleic acid sequence, is in higher abundance
in the target cell or tissue relative to other cells
or tissues of the mammal.
5 Advantageously, a synonymous codon is 

selected such that its corresponding iso-tRNA in the 
target cell or tissue is at a level which is at least 
110%, preferably at least 200%, more preferably at 
least 500%, and most preferably at least 1000%, of 
10 that expressed in the or each other cell or tissue of 
the mammal . 
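
The threshold comparison described above can be sketched as follows.
This is a minimal illustration, not the procedure of the specification:
the abundance values are invented and are not taken from TABLE 2, and
the function names, the data layout, and the rule of taking the smallest
target-to-other ratio across the other cells are all assumptions.

```python
# Hypothetical relative iso-tRNA levels, keyed by cell type and then by the
# codon each iso-tRNA reads. These numbers are invented for illustration.
ABUNDANCE = {
    "differentiated": {"cug": 1.0, "cuu": 3.0, "cua": 6.0},
    "undifferentiated": {"cug": 2.0, "cuu": 1.0, "cua": 1.0},
}


def fold_enrichment(codon, target, others, abundance):
    """Smallest target/other ratio of the iso-tRNA level for one codon.

    Assumes every level in the other cells is non-zero.
    """
    return min(abundance[target][codon] / abundance[o][codon] for o in others)


def candidate_synonyms(existing, synonyms, target, others, abundance, min_ratio=1.10):
    """Synonymous codons enriched in the target cell, best first.

    A codon qualifies when its iso-tRNA level in the target cell is at least
    `min_ratio` times the level in each other cell (1.10 means 110%), and when
    it is more enriched in the target cell than the existing codon is.
    """
    baseline = fold_enrichment(existing, target, others, abundance)
    picks = []
    for codon in synonyms:
        if codon == existing:
            continue
        ratio = fold_enrichment(codon, target, others, abundance)
        if ratio >= min_ratio and ratio > baseline:
            picks.append((codon, ratio))
    return sorted(picks, key=lambda pick: pick[1], reverse=True)
```

Raising min_ratio from 1.10 towards 2.0, 5.0 or 10.0 corresponds to the
progressively preferred thresholds of 110%, 200%, 500% and 1000%.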

Suitably, synonymous codons for selective
expression of a protein in a differentiated cell,
preferably a differentiated keratinocyte, are selected
from the group consisting of gca (Ala), cuu (Leu) and
cua (Leu).

Synonymous codons for selective expression
of a protein in an undifferentiated cell, preferably
an undifferentiated keratinocyte, are suitably
selected from the group consisting of cga (Arg), cci
(Pro) and aag (Asn).




Analysis of codon usage 

Alternatively, synonymous codons may be 
selected by analyzing the frequency at which codons 
are used by genes expressed in (i) particular cells or 
tissues, (ii) substantially all cells or tissues of 
the mammal, or (iii) an organism which may infect 
particular cells or tissues of the mammal. 

Codon frequency tables as well as suitable 
methods for determining frequency of codon usage in an 
organism are described, for example, in an article by 
Sharp et al (1988, Nucleic Acids Res. 16 8207-8211)
which is hereby incorporated by reference. 
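
As a rough illustration of how a codon frequency table of this kind
might be tallied, the sketch below pools codon counts over a set of
coding sequences and reports each codon per thousand codons. It is not
the method of Sharp et al; the function names are hypothetical, and the
input is assumed to be in-frame coding sequence starting at the first
codon.

```python
from collections import Counter


def codon_counts(cds):
    """Count in-frame codons of one coding sequence (DNA or RNA alphabet)."""
    seq = cds.lower().replace("t", "u")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def codon_frequencies(sequences):
    """Pooled codon usage, expressed per thousand codons."""
    total = Counter()
    for cds in sequences:
        total.update(codon_counts(cds))
    n = sum(total.values())
    if n == 0:
        return {}
    return {codon: 1000.0 * count / n for codon, count in total.items()}
```

Tables built this way for genes of a particular cell or tissue, for the
mammal as a whole, or for an infecting organism can then be compared
codon by codon.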

The relative level of gene expression 
(e.g., detectable protein expression vs no detectable 
protein expression) can provide an indirect measure of 
the relative abundance of specific iso-tRNAs expressed 
in different cells or tissues. For example, a virus 
may be capable of propagating within a first cell or 
tissue (which may include a cell or tissue at a 
specific stage of differentiation) but may be 
substantially incapable of propagating in a second 
cell or tissue (which may include a cell or tissue at 
another stage of differentiation). Comparison of the
pattern of codon usage by genes of the virus with the 
pattern of codon usage by genes expressed in the 
second cell or tissue may thus provide indirectly a 
set of synonymous codons which correspond to iso-tRNAs 
expressed at relatively high abundance in the first 
cell or tissue relative to the second cell or tissue 
and vice versa. Simultaneously, the above comparison
may also provide indirectly a set of synonymous codons
which correspond to iso-tRNAs expressed at relatively
high abundance in the second cell or tissue relative
to the first cell or tissue.
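
The indirect comparison described above can be sketched as a contrast
between two codon usage tables. The figures below are invented, the
function names are hypothetical, and the use of within-family usage
fractions with a fixed difference threshold is an assumption made for
illustration rather than a rule stated in the text.

```python
def synonymous_fractions(freq, family):
    """Share of each codon within one family of synonymous codons."""
    total = sum(freq.get(codon, 0.0) for codon in family)
    if total == 0:
        return {codon: 0.0 for codon in family}
    return {codon: freq.get(codon, 0.0) / total for codon in family}


def contrast_usage(virus_freq, cell_freq, families, min_diff=0.2):
    """Split codons by whether the virus or the second cell favours them.

    Returns (favoured_in_first, favoured_in_second): codons the virus uses
    more than genes of the second cell (suggesting iso-tRNAs abundant in the
    first, permissive cell), and codons showing the opposite pattern.
    """
    favoured_in_first, favoured_in_second = [], []
    for family in families.values():
        virus = synonymous_fractions(virus_freq, family)
        cell = synonymous_fractions(cell_freq, family)
        for codon in family:
            diff = virus[codon] - cell[codon]
            if diff >= min_diff:
                favoured_in_first.append(codon)
            elif diff <= -min_diff:
                favoured_in_second.append(codon)
    return favoured_in_first, favoured_in_second


# Invented usage figures (per thousand codons) for three leucine codons.
FAMILIES = {"Leu": ["cug", "cuu", "cua"]}
VIRUS_FREQ = {"cug": 10.0, "cuu": 40.0, "cua": 50.0}
SECOND_CELL_FREQ = {"cug": 70.0, "cuu": 20.0, "cua": 10.0}
```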

From the foregoing, a synonymous codon 
according to the invention may correspond to a codon 
including, but not limited to, (1) a codon used at
relatively high frequency by genes, preferably highly
expressed genes, of the target cell or tissue, (2) a
codon used at relatively high frequency by genes,
preferably highly expressed genes, of the or each
other cell or tissue, (3) a codon used at relatively
high frequency by genes, preferably highly expressed
genes, of the mammal, (4) a codon used at relatively
low frequency by genes of the target cell or tissue,
(5) a codon used at relatively low frequency by genes
of the or each other cell or tissue, (6) a codon used
at relatively low frequency by genes of the mammal,
(7) a codon used at relatively high frequency by genes
of another organism, and (8) a codon used at
relatively low frequency by genes of another organism.

For example, codons used at a relatively
high frequency by genes, preferably highly expressed
genes, of the mammal may be selected from the group
consisting of: cuc (Leu), cuu (Leu), cug (Leu), uua
(Leu), uug (Leu); cgg (Arg), cgc (Arg), aga (Arg), agg
(Arg); agu (Ser), agc (Ser), ucu (Ser), ucc (Ser), and
uca (Ser). Alternatively, such codons may include auu
(Ile), auc (Ile); guu (Val), guc (Val), gug (Val); acu
(Thr), acc (Thr), aca (Thr); gcu (Ala), gcc (Ala), gca
(Ala); cag (Glu); ggc (Gly), gga (Gly), ggg (Gly).

Codons used at a relatively low frequency
by genes of the mammal are described, for example, in
Sharp et al (1988, supra). Such codons may comprise
cua (Leu); cga (Arg), cgu (Arg); ucg (Ser).
Alternatively, such codons may include aua (Ile); gua
(Val); acg (Thr); gcg (Ala); caa (Glu); ggu (Gly).

Construction of synthetic nucleic acid sequences

The step of replacing existing codons with
synonymous codons may be effected by any suitable
technique. For example, in vitro mutagenesis methods
may be employed which are well known to those of skill
in the art. Suitable mutagenesis methods are
described for example in the relevant sections of
Ausubel et al. (supra) and of Sambrook et al.
(supra), which are hereby incorporated by reference.
Alternatively, suitable methods for altering DNA are
set forth, for example, in U.S. Patent Nos 4,184,917,
4,321,365 and 4,351,901, which are hereby incorporated
by reference. Instead of in vitro mutagenesis, the
second nucleic acid sequence may be synthesized de
novo using readily available machinery. Sequential
synthesis of DNA is described, for example, in U.S.
Patent No 4,293,652, which is hereby incorporated by
reference. However, it should be noted that the
present invention is not dependent on and not directed
to any one particular technique for replacing
existing codons with synonymous codons.

It is not necessary to replace all the 
existing codons of the parent nucleic acid sequence 
with synonymous codons each corresponding to an
iso-tRNA expressed in relatively high abundance in the
target cell compared to other cells. Increased
expression may be accomplished even with partial
replacement. Preferably, the replacing step affects
5%, 10%, 15%, 20%, 25%, 30%, more preferably 35%, 40%,
50%, 60%, 70% or more of the existing codons of the
parent nucleic acid sequence.
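
Partial replacement and the resulting percentage of altered codons can
be sketched as below. This is an illustrative in silico analogue only,
not a laboratory protocol: the function names and the example mapping
are hypothetical, and the standard genetic code is assumed. The check
that each substitute encodes the same amino acid reflects the
requirement that replacements be synonymous.

```python
# Standard genetic code, built from the conventional u, c, a, g ordering.
BASES = "ucag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(seq):
    """Lower-case RNA codons of an in-frame sequence (partial tail ignored)."""
    rna = seq.lower().replace("t", "u")
    return [rna[i:i + 3] for i in range(0, len(rna) - len(rna) % 3, 3)]


def translate(seq):
    """One-letter protein translation; '*' marks a stop codon."""
    return "".join(CODE[codon] for codon in split_codons(seq))


def replace_codons(parent, preferred):
    """Apply an existing-codon to synonymous-codon mapping.

    Returns (synthetic_sequence, percent_of_codons_replaced).
    Raises ValueError if a replacement would change the encoded amino acid.
    """
    codons = split_codons(parent)
    if not codons:
        raise ValueError("parent sequence contains no complete codon")
    replaced, changed = [], 0
    for codon in codons:
        new = preferred.get(codon, codon)
        if CODE[new] != CODE[codon]:
            raise ValueError(f"{codon} -> {new} is not a synonymous replacement")
        if new != codon:
            changed += 1
        replaced.append(new)
    return "".join(replaced), 100.0 * changed / len(codons)
```

For instance, mapping cug to cua and gcg to gca in a short parent
sequence leaves the translation unchanged while altering only the codons
named in the mapping.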

The parent nucleic acid sequence is 
preferably a natural gene. By "natural gene" is meant 
a gene that naturally encodes the protein. However, 
it is possible that the parent nucleic acid sequence 
encodes a protein that is not naturally-occurring but
has been engineered using recombinant techniques.

The parent nucleic acid sequence need not
be obtained from the mammal but may be obtained from
any suitable source such as from a eukaryotic or
prokaryotic organism. For example, the parent nucleic
acid sequence may be obtained from another mammal or
other animal. Alternatively, the parent nucleic acid
sequence may be obtained from a pathogenic organism.
In such a case, a natural host of the pathogenic
organism is preferably a mammal. For example, the
pathogenic organism may be a yeast, bacterium or
virus.

For example, suitable proteins which may be
used for selective expression in accordance with the
invention include, but are not limited to, the cystic
fibrosis transmembrane conductance regulator (CFTR)
protein, and adenosine deaminase (ADA). In the case
of CFTR, a parent nucleic acid sequence encoding the
CFTR protein which may be utilized to produce the
synthetic nucleic acid sequence is described, for
example, in Riordan et al (1989, Science 245 1066-
1073), and in the GenBank database under Accession No.
HUMCFTRM, which are hereby incorporated by reference.

The term "nucleic acid sequence" as used 
herein designates mRNA, RNA, cRNA, cDNA or DNA. 

Regulatory nucleotide sequences which may
be utilized to regulate expression of the synthetic
nucleic acid sequence include, but are not limited to,
a promoter, an enhancer, and a transcriptional
terminator. Such regulatory sequences are well known
to those of skill in the art.

Synthetic nucleic acid sequences according 
to the invention may be operably linked to one or more 
regulatory sequences in the form of an expression
vector. By "vector" is meant a nucleic acid molecule,
preferably a DNA molecule derived, for example, from a
plasmid, bacteriophage, or mammalian or insect virus,
into which a synthetic nucleic acid sequence may be
inserted or cloned. A vector preferably contains one
or more unique restriction sites and may be capable of
autonomous replication in a defined host cell
including the target cell or tissue or a precursor
cell or precursor tissue thereof, or be integratable
with the genome of the defined host such that the
cloned sequence is reproducible. Thus, by "expression
vector" is meant any autonomous element capable of
directing the synthesis of a protein. Such expression
vectors are well known by practitioners in the art.

The term "precursor cell" as used herein 
refers to a cell that gives rise to the target cell. 

The invention also contemplates synthetic 
nucleic acid sub-sequences encoding desired portions
of the protein. A nucleic acid sub-sequence encodes a
domain of the protein having a function associated 
therewith and preferably encodes at least 10, 20, 50, 
100, 150, or 500 contiguous amino acids of the 
protein. 

The step of introducing the synthetic
nucleic acid sequence into a target cell will differ
depending on the intended use and/or species, and may
involve non-viral and viral vectors, cationic
liposomes, retroviruses and adenoviruses such as, for
example, described in Mulligan, R.C., (1993 Science
260 926-932) which is hereby incorporated by
reference. Such methods may include:

(i) Local application of the synthetic
nucleic acid sequence by injection (Wolff et al.,
1990, Science 247 1465-1468, which is hereby
incorporated by reference), surgical implantation,
instillation or any other means. This method may also
be used in combination with local application by
injection, surgical implantation, instillation or any
other means, of cells responsive to the protein
encoded by the synthetic nucleic acid sequence so as
to increase the effectiveness of that treatment. This
method may also be used in combination with local
application by injection, surgical implantation,
instillation or any other means, of another factor or
factors required for the activity of said protein.

(ii) General systemic delivery by
injection of DNA (Calabretta et al., 1993, Cancer
Treat. Rev. 19 169-179, which is hereby incorporated
by reference), or RNA, alone or in combination with
liposomes (Zhu et al., 1993, Science 261 209-212,
which is hereby incorporated by reference), viral
capsids or nanoparticles (Bertling et al., 1991,
Biotech. Appl. Biochem. 13 390-405, which is hereby
incorporated by reference) or any other mediator of
delivery. Improved targeting might be achieved by
linking the synthetic nucleic acid sequence to a
targeting molecule (the so-called "magic bullet"
approach employing, for example, an antibody), or by
local application by injection, surgical implantation
or any other means, of another factor or factors
required for the activity of the protein produced from
said synthetic nucleic acid sequence, or of cells
responsive to said protein.

(iii) Injection or implantation or delivery
by any means, of cells that have been modified ex vivo
by transfection (for example, in the presence of
calcium phosphate: Chen et al., 1987, Mole. Cell
Biochem. 7 2745-2752, or of cationic lipids and
polyamines: Rose et al., 1991, BioTech. 10 520-525,
which articles are hereby incorporated by reference),
infection, injection, electroporation (Shigekawa et
al., 1988, BioTech. 6 742-751, which is hereby
incorporated by reference) or any other way so as to
increase the expression of said synthetic nucleic acid
sequence in those cells. The modification may be
mediated by plasmid, bacteriophage, cosmid, viral
(such as adenoviral or retroviral; Mulligan, 1993,
Science 260 926-932; Miller, 1992, Nature 357 455-460;
Salmons et al., 1993, Hum. Gen. Ther. 4 129-141, which
articles are hereby incorporated by reference) or
other vectors, or other agents of modification such as
liposomes (Zhu et al., 1993, Science 261 209-212,
which is hereby incorporated by reference), viral
capsids or nanoparticles (Bertling et al., 1991,
Biotech. Appl. Biochem. 13 390-405, which is hereby
incorporated by reference), or any other mediator of
modification. The use of cells as a delivery vehicle
for genes or gene products has been described by Barr
et al., 1991, Science 254 1507-1512 and by Dhawan et
al., 1991, Science 254 1509-1512, which articles are
hereby incorporated by reference. Treated cells may
be delivered in combination with any nutrient, growth
factor, matrix or other agent that will promote their
survival in the treated subject.

In yet another aspect, the invention
provides a pharmaceutical composition comprising the
synthetic nucleic acid sequences of the invention and a
pharmaceutically acceptable carrier.

By "pharmaceutically-acceptable carrier" is
meant a solid or liquid filler, diluent or
encapsulating substance that may be safely used in
systemic administration. Depending upon the
particular route of administration, a variety of
pharmaceutically acceptable carriers, well known in
the art, may be used. These carriers may be selected
from a group including sugars, starches, cellulose and
its derivatives, malt, gelatin, talc, calcium sulfate,
vegetable oils, synthetic oils, polyols, alginic acid,
phosphate buffered solutions, emulsifiers, isotonic
saline, and pyrogen-free water.

Any suitable technique may be employed for 
determining expression of the protein from said 
synthetic nucleic acid sequence in a particular cell
or tissue. For example, expression can be measured
using an antibody specific for the protein of interest 
or portion thereof. Such antibodies and measurement 
techniques are well known to those skilled in the art. 

Applications 

In one embodiment of the present invention,
the target cell is suitably a differentiated cell.
Advantageously, the protein which is desired to be
selectively expressed in the differentiated cell is
not expressible in a precursor cell thereof (such as
an undifferentiated or less differentiated cell of the
mammal) from a parent nucleic acid sequence at a level
sufficient to effect a particular function associated
with said protein. In this embodiment, the step of
replacing at least one existing codon with a
synonymous codon is characterized in that the
synonymous codon corresponds to an iso-tRNA which,
when compared to the iso-tRNA corresponding to the at
least one existing codon, is in relatively higher
abundance in the differentiated cell compared to the
precursor cell. Accordingly, a synthetic nucleic acid
sequence is produced having altered translational
kinetics compared to the parent nucleic acid sequence
wherein the protein is expressible in the
differentiated cell at a level sufficient to effect a
particular function associated with said protein, but
wherein the protein is not expressible in the
precursor cell at a level sufficient to effect said
function.

As used herein, the term "function" refers
to a biological or therapeutic function.

The above embodiment may be utilized
advantageously for somatic gene therapy where
overexpression of a protein in undifferentiated cells
such as stem cells has undesirable consequences
including death or differentiation of the stem cells.
In such a case, a suitable protein may include cystic
fibrosis transmembrane conductance regulator (CFTR)
protein, and adenosine deaminase (ADA).
The differentiated cell may comprise a cell
of any lineage including a cell of epithelial,
hemopoietic or neural origin. For example, the
differentiated cell may be a mature differentiated
keratinocyte.

Targeting expression of a protein to
progeny of a stem cell but not to the stem cell itself

The synthetic nucleic acid sequence
produced above may be transfected directly into the
differentiated cell for the desired function or
alternatively, transfected into the precursor cell.
For example, in the case of ADA deficiency, expression
of ADA in stem cells may result in loss of stem
phenotype which is undesirable. However, an
advantageous therapy may reside in transducing
autologous marrow stem cells with a synthetic nucleic
acid sequence operably linked to one or more
regulatory sequences, wherein existing codons of the
wild type ADA gene have been replaced with synonymous
codons each corresponding to an iso-tRNA expressed in
relatively high abundance in differentiated
lymphocytes compared to the marrow stem cells. The
transduced stem cells may then be reinfused into the
patient. This approach will result in transduced
marrow stem cells which are not capable of expressing
ADA themselves, but which are able to give rise to a
renewable population of differentiated lymphocytes
which are capable of expressing ADA at levels
sufficient to permit a therapeutic effect. In this
regard, a suitable cell source for this purpose may
comprise stem cells isolated as CD34 positive cells
from a patient's peripheral blood or marrow. For gene
delivery, a suitable vector may include a retrovirus
or adeno-associated virus.

Alternatively, in the case of inducing cell 
mediated immunity, dendritic cells are important 
antigen presenting cells (APC) but have a very limited
life span for antigen presentation once activated, of
between 14 and 21 days. Consequently, dendritic cells
provide relatively short-term immune stimulation that
may not be optimal. However, in accordance with the
present invention, a long-term immune stimulation may
be provided by transducing autologous bone
marrow-derived CD34 positive dendritic cell precursors
with a synthetic nucleotide sequence encoding an antigen,
such as the melanoma antigen MART-1, wherein the
synthetic sequence is operably linked to one or more
regulatory sequences, and wherein existing codons of a
wild type nucleotide sequence encoding MART-1 have
been replaced with synonymous codons each
corresponding to an iso-tRNA expressed in relatively
high abundance in dendritic cells compared to the
dendritic cell precursors. The transduced dendritic
cell precursors may then be reinfused into the
patient. This approach will result in transduced
dendritic cell precursors which are not capable of
expressing MART-1 themselves, but which are able to
give rise to a renewable population of dendritic cells
which are capable of expressing MART-1 at levels
sufficient to permit a lifelong intermittent
restimulation of a cytotoxic T lymphocyte (CTL)
response to the MART-1 antigen.

Targeting expression of a protein to a stem 
cell but not to progeny of the stem cell 

In an alternate embodiment, the target cell 
may be an undifferentiated cell wherein the protein is 
not expressible in said undifferentiated cell, from a 
parent nucleic acid sequence encoding the protein, at
a level sufficient to effect a particular function
associated with the protein. In such a case, at least
one existing codon of the parent nucleic acid sequence
is replaced with a synonymous codon corresponding to
an iso-tRNA which, when compared to the iso-tRNA
corresponding to the at least one existing codon, is
in relatively higher abundance in the undifferentiated
cell compared to a differentiated cell. This results
in a synthetic nucleic acid sequence having altered
translational kinetics compared to said parent nucleic
acid sequence wherein the protein is expressible in
the undifferentiated cell at a level sufficient to
effect a particular function associated with the
protein, but wherein the protein is not expressible in
differentiated cells derived from the undifferentiated
cell at a level sufficient to effect said function.

This alternate embodiment may, by way of 
example, be used to permit expression of a 
transcriptional regulatory protein which when
expressed in a particular undifferentiated cell or
stem cell facilitates differentiation of the stem cell
along a particular cell lineage. It will be
appreciated that in such a case, the regulatory
protein is normally expressed from a gene in which the
existing codons correspond to iso-tRNAs which are in
relatively low abundance in the stem cell compared to
other iso-tRNAs and that therefore the protein is not
capable of being expressed at levels sufficient for
commitment of the stem cell to differentiate along a
particular cell lineage. It will also be apparent
that such commitment to differentiate along a
particular cell lineage may be utilized to prevent
production of a particular lineage of cells such as
cancer cells.

Alternatively, the method according to this
embodiment may be used to express a transcriptional
regulatory protein that is involved in the production
of a therapeutic agent or agents. Such a protein may
include, for example, NF-kappa-B transcription factor
p65 subunit (NF-kappa-B p65) which is involved in the
production of interleukin-2 (IL-2), interleukin-3
(IL-3) and granulocyte and macrophage colony stimulating
factor (GMCSF). NF-kappa-B p65 is encoded naturally by
a nucleotide sequence comprising a number of existing
codons each corresponding to an iso-tRNA expressed in
relatively low abundance in stem cells. Accordingly,
such sequence may be used as the parent nucleic acid
sequence according to this embodiment. A suitable
nucleotide sequence encoding this protein is
described, for example, in Lyle et al (1994, Gene 138
265-266) and in the EMBL database under Accession No
HSNFKB65A which are hereby incorporated by reference.

A suitable undifferentiated cell which may 
be utilized in accordance with the present embodiment 
includes but is not limited to a stem cell, such as a 
CD34 positive hemopoietic stem cell.

The present embodiment may also be used 
advantageously for gene therapy where ongoing 
regulated expression of a transgene is desirable. For 
example, secure but reversible regulation of fertility
is desirable in veterinary practice and in humans.
Such regulation may be effected by transducing
autologous breast ductal epithelial cells with a
synthetic nucleic acid encoding a luteinising hormone
(LH) antagonist or a luteinising hormone releasing
hormone (LHRH) antagonist under the control of one or
more regulatory sequences. The synthetic nucleic acid
may be produced by replacing existing codons of a
parent nucleic acid with synonymous codons
corresponding to iso-tRNAs expressed in relatively
high abundance in resting breast ductal epithelial
cells compared to differentiated cells arising
therefrom. Once the transduced cells are implanted
back into the patient, expression may be switched off
by oral administration of progestagen, forcing the
differentiation of the majority of the stem cells and
loss of expression of the antagonist. Once pregnancy
is established, the suppression would be
self-sustaining by the naturally produced progestagen. The
iso-tRNA composition of resting and oestrogen-driven
breast epithelial cells may be established by first
obtaining resting cells from reduction mammoplasty,
and determining the cellular tRNA composition in the
presence and absence of oestrogen. The synthetic
nucleic acid sequence may be introduced into
autologous resting epithelial cells by cell
electroporation ex vivo, and the transduced cells may
be subsequently transplanted subcutaneously into the
patient. Progestagen may be administered as required
to reverse regulation of fertility.

Targeting expression of a toxin to a tumor 
cell but not to any other cells of the mammal

Many toxins and drugs are available that 
can kill tumor cells. However, these toxins and drugs 
are generally toxic for all dividing cells. This
problem may be nevertheless ameliorated by
establishing the isoacceptor tRNA composition in a
tumor clone, and constructing a synthetic toxin gene
(e.g., ricin gene) or a synthetic anti-proliferation
gene (e.g., the tumor suppressor p53) using synonymous
codons corresponding to iso-tRNAs expressed at
relatively high abundance in the tumor clone compared
to normal dividing cells of the mammal. The synthetic
gene is then introduced into the patient by suitable
means to selectively express the synthetic gene in
tumor cells.

Alternatively, a chemotherapy enhancing 
product gene (i.e., a drug resistance gene, e.g., the
multi-drug resistance gene) using a codon pattern
unlikely to be expressed in the tumor efficiently may 
be employed. 



Targeting gene therapy to control body fat

Leptins are proteins known to control
satiety. By analogy with animal data, however, if too
much leptin is administered to a patient,
leptin-induced starvation might occur. Advantageously, a
synthetic gene encoding leptin may be constructed
including synonymous codons corresponding to iso-tRNAs
expressed at relatively high levels in activated
adipocytes compared to non-activated adipocytes. The
synthetic gene may then be introduced into the patient
by suitable means such that leptin is only expressed
substantially in activated adipocytes as opposed to
non-activated adipocytes. As body fat turnover
diminishes under the influence of leptin-reduced
appetite, the metabolic activity of the adipocytes
falls and the leptin production decreases
correspondingly.



Targeting expression of a protein to a
stage of the cell cycle

In another embodiment of the invention, the 
target cell may be a non-cycling cell. In this case, 
the protein which is desired to be selectively
expressed in the non-cycling cell is expressible in a
cycling cell of the mammal from a parent nucleic acid
sequence at a level sufficient to effect a particular
function associated with the protein. The synonymous
codons are selected such that each corresponds to an
iso-tRNA which, when compared to the iso-tRNA
corresponding to the at least one existing codon, is
in higher abundance in the non-cycling cell compared
to the cycling cell. Accordingly, a synthetic nucleic
acid sequence is produced having altered translational
kinetics compared to the parent nucleic acid sequence
wherein the protein is expressible in the non-cycling
cell at a level sufficient to effect a particular
function associated with said protein, but wherein the
protein is not expressible in the cycling cell to
effect said function.

The term "non-cycling cell" as used herein 
refers to a cell that has withdrawn from the cell 
cycle and has entered the G0 state. In this state, it
is well known that transcription of endogenous genes
and protein translation are at substantially reduced
levels compared to phases of the cell cycle, namely
G1, S, G2 and M.

By "cycling cell" is meant a cell which is
in one of the above phases of the cell cycle.

Expressing a protein in a target cell or 
tissue by in vivo expression of iso-tRNAs in the
target cell or tissue 

In another aspect, the invention extends to 
a method wherein a protein may be selectively 
expressed in a target cell by introducing into the
cell an auxiliary nucleic acid sequence capable of
expressing therein one or more isoaccepting transfer
RNAs which are not expressed in relatively high
abundance in the cell but which are rate limiting for
expression of the protein from a parent nucleic acid
sequence to a level sufficient for effecting a
function associated with the protein. In this
embodiment, introduction of the auxiliary nucleic acid
sequence in the cell changes the translational
kinetics of the parent nucleic acid sequence such that
said protein is expressed at a level sufficient to
effect a function associated with the protein.
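
One way to shortlist iso-tRNAs that might be supplied by such an
auxiliary sequence is to look for codons of the parent sequence whose
iso-tRNAs are scarce in the target cell. The sketch below is only an
illustration of that idea: the abundance values and threshold are
invented, the function name is hypothetical, and treating "used often
and read by a scarce iso-tRNA" as a proxy for "rate limiting" is an
assumption rather than a criterion given in the text.

```python
from collections import Counter


def rate_limiting_codons(parent, trna_level, threshold):
    """Codons of the parent sequence read by scarce iso-tRNAs.

    `trna_level` maps a codon to the relative level of its iso-tRNA in the
    target cell; codons missing from the table are treated as level 0.0.
    Returns (codon, occurrences, level) tuples, most frequently used first
    and, among equally used codons, scarcest first.
    """
    seq = parent.lower().replace("t", "u")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    scarce = []
    for codon, n in counts.items():
        level = trna_level.get(codon, 0.0)
        if level < threshold:
            scarce.append((codon, n, level))
    return sorted(scarce, key=lambda item: (-item[1], item[2]))


# Invented relative iso-tRNA levels in a hypothetical target cell.
TARGET_CELL_LEVELS = {"aug": 5.0, "cga": 0.2, "ccu": 0.5, "aag": 4.0}
```

The codons returned would then indicate which isoaccepting tRNAs the
auxiliary nucleic acid sequence might be designed to express.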

The step of introducing the auxiliary 
nucleic acid sequence into the target cell or a tissue 
comprising a plurality of these cells may be effected
by any suitable means. For example, analogous
methodologies for introduction of the synthetic 
nucleic acid sequence referred to above may be 
employed for delivery of the auxiliary nucleic acid 
sequence into said cycling cell. 

Assembly of virus particles in cells which
do not normally permit assembly of virus particles

In yet another aspect, the invention 
extends to a method for producing a virus particle in 
a cycling eukaryotic cell. The virus particle will
comprise at least one protein necessary for virus
assembly, wherein the at least one protein is not
expressed in the cell from a parent nucleic acid
sequence at a level sufficient to permit virus
assembly therein. This method is characterized by
replacing at least one existing codon of the parent
nucleic acid sequence with a synonymous codon to
produce a synthetic nucleic acid sequence having
altered translational kinetics compared to the parent
nucleic acid sequence such that the at least one
protein is expressible from the synthetic nucleic acid
sequence in the cell at a level sufficient to permit
virus assembly therein. The synthetic nucleic acid
sequence so produced is operably linked to one or more
regulatory nucleotide sequences and is then introduced
into the cell or a precursor cell thereof. The at
least one protein is expressed subsequently in the
cell in the presence of other viral proteins required
for assembly of the virus particle to thereby produce
the virus particle.

Advantageously, the synonymous codon 
corresponds to an iso-tRNA expressed at relatively 
high level in the cell compared to the iso-tRNAs 
corresponding to the existing codons.

The cycling cell may be any cell in which
the virus is capable of replication. Suitably, the
cycling cell is a eukaryotic cell. Preferably, the
cycling cell for production of the virus particle is a
eukaryotic cell line capable of being grown in vitro
such as, for example, CV-1 cells, COS cells, yeast or
Spodoptera cells.

WO 99/02694 PCT/AU98/00530

Suitably, the at least one protein of the
virus particle is a viral capsid protein. Preferably,
the viral capsid proteins comprise L1 and/or L2
proteins of papillomavirus.

The other viral proteins required for
assembly of the virus particle in the cell may be
expressed from another nucleic acid sequence(s) which
suitably contain the rest of the viral genome. In the
case of the at least one protein comprising L1 and/or
L2 of papillomavirus, said other nucleic acid
sequence(s) preferably comprises the papillomavirus
genome without the nucleotide sequences encoding L1
and/or L2.

In yet a further aspect of the invention,
there is provided a method for producing a virus
particle in a cycling cell, said virus particle
comprising at least one protein necessary for assembly
of said virus particle, wherein said at least one
protein is not expressed in said cell from a parent
nucleic acid sequence at a level sufficient to permit
virus assembly therein, and wherein at least one
existing codon of said parent nucleic acid sequence is
rate limiting for the production of said at least one
protein to said level, said method including the step
of introducing into said cell a nucleic acid sequence
capable of expressing therein an isoaccepting transfer
RNA specific for said at least one codon.

In yet a further aspect, the invention
resides in virus particles resulting from the above
methods.

The invention further contemplates cells or
tissues containing therein the synthetic nucleic acid
sequences of the invention, or alternatively, cells or
tissues produced from the methods of the invention.

The invention is further described with 
reference to the following non-limiting examples. 

EXAMPLE 1

Expression of synthetic L1 and L2 protein in
undifferentiated cells.

Materials and Methods

Codon replacements in the bovine PV (BPV)
L1 and L2 genes

The DNA and amino acid sequences of the
wild-type L1 (SEQ ID NOS:1,2) and L2 genes (SEQ ID
NOS:5,6) are shown respectively in Figures 1A and 1B.
To determine whether the presence of rare codons in
wild-type L1 (SEQ ID NO:1) and L2 (SEQ ID NO:5) genes
(Table 1) inhibited translation, we synthesized the L1
(SEQ ID NO:3) and L2 (SEQ ID NO:7) genes by using
synonymous substitutions as shown. To construct the
synthetic sequences, we synthesized 11 pairs of
oligonucleotides for L1 and 10 pairs of
oligonucleotides for L2. Each pair of
oligonucleotides has restriction sites incorporated to
facilitate subsequent cloning (Figures 1A and 1B).
The degenerate oligonucleotides were used to amplify
L1 and L2 sequences by PCR using a plasmid with BPV1
genome as the template. The amplified fragments were
cut with appropriate enzymes and sequentially ligated
to pUC18 vector, producing pUCHBL1 and pUCHBL2. The
synthetic L1 (SEQ ID NO:3) and L2 (SEQ ID NO:7)
sequences were sequenced and found to be error-free,
and then sub-cloned into the mammalian expression
vector pCDNA3 containing SV40 ori (Invitrogen), giving
expression plasmids pCDNA/HBL1 and pCDNA/HBL2. To
compare expression of L1 and L2 with that of the
original sequences, the wild type L1 (SEQ ID NO:1) and
L2 (SEQ ID NO:5) genes were cloned into the pCDNA3
vector, resulting in pCDNA/BPVL1wt and pCDNA/BPVL2wt.

Immunofluorescence and Western blot
staining

For immunoblotting assays, Cos-1 cells in
6-well plates were transfected with 2 µg L1 or L2
expression plasmids using lipofectamine (Gibco). 36
hrs after transfection, cells were washed with 0.15 M
phosphate buffered 0.9% NaCl (PBS) and lysed in SDS
loading buffer. The cellular proteins were separated
by 10% SDS PAGE and blotted onto nitrocellulose
membrane. The L1 or L2 proteins were identified by
electrochemiluminescence (Amersham, UK), using BPV1 L1
(DAKO) or L2-specific (17) antisera. For
immunofluorescent staining, Cos-1 cells were grown on
8-chamber slides, transfected with plasmids, and
fixed and permeabilised with 85% ethanol 36 hr after
transfection. The slides were blocked with 5% milk-PBS
and probed with L1 or L2-specific antisera, followed
by FITC-conjugated anti-rabbit IgG (Sigma). For GFP or
PGFP plasmid transfected cells, the cells were fixed
with 4% buffered formaldehyde and viewed by
epifluorescence microscopy.



Northern blotting

Cos cells transfected with various plasmids
were used to extract cytoplasmic or total RNA using
the QIAGEN RNeasy mini kit according to the
supplier's handbook. Briefly, for cytoplasmic RNA
purification, buffer RLN (50 mM Tris, pH 8.0, 140 mM
NaCl, 1.5 mM MgCl2 and 0.5% NP40) was directly added to
monolayer cells and cells were lysed at 4 °C for 5 min.
After the nuclei were removed by centrifugation,
cytoplasmic RNAs were purified by column. For total
RNA extraction, the monolayer cells were lysed using
buffer RLT supplied by the kit and RNA was purified by
spin column. The purified RNAs were separated by 1.5%
agarose gel in the presence of formaldehyde. The RNAs
were then blotted onto nylon membrane and probed with
(a) 1:1 mixed 5'-end labelled L1 wt and HBL1
fragments; (b) 1:1 mixed 5'-end labelled L2 wt and
HBL2 fragments; (c) 1:1 mixed 5'-end labelled GFP and
PGFP fragments or (d) randomly labelled GAPDH
fragment. The blots were washed extensively at 65 °C
and exposed to X-ray films for three days.



Results

To test the hypothesis that the codon
composition of the genes encoding the L1 and L2 capsid
proteins of papillomavirus (PV) contributes to their
preferential expression in differentiated epithelial
cells, we produced synthetic BPV1 L1 (SEQ ID NO:3) and
L2 (SEQ ID NO:7) genes, substituting codons
preferentially used in mammalian genes for the codons
frequently present in the wild type BPV1 L1 and L2
sequences which are rare in eukaryotic genes (Figures
1A, 1B).

For the L1 gene, a total of 202 base
substitutions were made in 196 codons, without
changing the encoded amino acid sequence (Figure 1A).
This synthetic "humanized" BPV L1 gene (SEQ ID NO:3)
was designated HBL1. In a similarly modified BPV1 L2
gene (SEQ ID NO:7) designated HBL2, 303 bases were
changed to substitute 290 less frequently used codons
with the corresponding preferentially used codons.
Using the synthetic HBL1 (SEQ ID NO:3) and HBL2 (SEQ
ID NO:7) genes, we constructed two eukaryotic
expression plasmids based on pCDNA3, and designated
pCDNA/HBL1 and pCDNA/HBL2. Similar expression
plasmids, constructed with the wild type BPV1 L1 (SEQ
ID NO:1) and BPV1 L2 (SEQ ID NO:5) genes, were
designated pCDNA/BPVL1wt and pCDNA/BPVL2wt,
respectively. In each of these plasmids the SV40 ori
allowed replication in Cos-1 cells, and the L1 or L2
gene was driven by a strong constitutive CMV promoter.
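
Tallies such as the number of substituted bases and codons, and the claim that the encoded amino acid sequence is unchanged, can be checked mechanically. The following Python sketch shows one way to do this; the function name `compare_coding_sequences` and the two short test sequences are hypothetical and are not the L1 or L2 sequences of the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def compare_coding_sequences(parent, synthetic):
    """Return (changed_codons, changed_bases) for two in-frame sequences.

    Raises ValueError if the lengths differ, are not a multiple of three,
    or if any substitution changes the encoded amino acid.
    """
    if len(parent) != len(synthetic) or len(parent) % 3:
        raise ValueError("sequences must be in frame and of equal length")
    changed_codons = 0
    changed_bases = 0
    for i in range(0, len(parent), 3):
        p, s = parent[i:i + 3], synthetic[i:i + 3]
        if CODON_TABLE[p] != CODON_TABLE[s]:
            raise ValueError(f"non-synonymous change at codon {i // 3 + 1}")
        diff = sum(a != b for a, b in zip(p, s))
        changed_bases += diff
        if diff:
            changed_codons += 1
    return changed_codons, changed_bases


# Hypothetical five-codon example (Met-Leu-Arg-Thr-Ile), illustrative only.
codons, bases = compare_coding_sequences("ATGTTACGTACAATA", "ATGCTGCGCACCATC")
print(codons, bases)
```

A single codon can contribute more than one base change, which is why the number of substituted bases can exceed the number of substituted codons.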

To compare the expression of the synthetic
humanized and the wild type BPV1 L1 or BPV1 L2 genes,
we separately transfected Cos-1 cells with each of the
L1 and L2 plasmids described above. Transfected cells
were analyzed for expression of L1 (SEQ ID NO:2,4) or
L2 (SEQ ID NO:6,8) protein by immunofluorescence 36 hr
after transfection (Figures 2A and 3A). Cells
transfected with the pCDNA3 expression plasmid
containing the synthetic humanized L1 (SEQ ID NO:3) or
L2 (SEQ ID NO:7) genes were observed to produce large
amounts of the corresponding protein, while cells
transfected with expression plasmids with the wild
type L1 (SEQ ID NO:1) or L2 (SEQ ID NO:5) sequences
produced no detectable L1 or L2 protein (Figures 2A
and 3A, see nuclear staining of L1 and L2 proteins).
To compare more accurately the expression of the
different L1 and L2 constructs, L1 and L2 protein
expression was assessed by immunoblot in Cos-1 cells
transfected with the wild type or synthetic humanized
BPV1 L1 or L2 pCDNA3 expression constructs (Figures 2B
and 3B). Large amounts of immunoreactive L1 and L2
proteins were expressed from the synthetic humanized
L1 (SEQ ID NO:3) and L2 (SEQ ID NO:7) sequences, but
no L1 or L2 protein was expressed from the wild type
L1 and L2 sequences (SEQ ID NO:1,5).

To establish whether the alterations to the
primary sequence of the L1 and L2 mRNA which resulted
from the codon alterations also affected steady state
expression of the corresponding message, mRNA was
prepared from Cos-1 cells transfected with the various
capsid protein gene constructs. Using GAPDH as an
internal standard it was established by Northern blot
that two to three times more modified than wild type
L1 mRNA, and similar levels of wild type and modified
L2 mRNA, were present in the cytoplasm of transfected
cells (Figures 2C and 3C). The amount of L1 or L2
protein expressed per arbitrary unit of L1 or L2 mRNA
was at least 100 fold higher for the humanized gene
constructs than for the natural gene constructs.

EXAMPLE 2

Papillomavirus late protein translation in vitro

In vitro translation assay

One microgram of each plasmid was
incubated with 20 µCi ³⁵S-methionine (Amersham) and 40
µL T7 coupled rabbit reticulocyte or wheat germ
lysates (Promega). Translation was performed at 30 °C
and stopped by adding SDS loading buffer. The L1
proteins were separated by 10% SDS PAGE and examined
by autoradiography.

Production of aminoacyl-tRNA

2.5 x 10⁻⁴ M tRNA (Boehringer) was added to
a 20 µL reaction containing 10 mM Tris-acetate,
pH 7.8, 44 mM KCl, 12 mM MgCl2, 9 mM β-mercaptoethanol,
38 mM ATP, 0.25 mM GTP and 7 µL rabbit reticulocyte
extract. The reaction was carried out at 25 °C for 20
min, and 30 µL H2O was added to the reaction to dilute
the tRNAs to 1 x 10⁻⁴ M. The aminoacyl-tRNAs were then
aliquoted and stored at -70 °C.
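
The dilution in the preceding protocol follows the usual relation C1 x V1 = C2 x V2. As a quick check, assuming the 30 µL of water is added to the whole 20 µL reaction and that the starting tRNA concentration in that reaction is 2.5 x 10⁻⁴ M, the final concentration works out to 1 x 10⁻⁴ M. The helper name below is hypothetical.

```python
def diluted_concentration(c_initial_molar, v_initial_ul, v_added_ul):
    """Concentration after adding diluent, from C1 * V1 = C2 * V2."""
    if v_initial_ul <= 0 or v_added_ul < 0:
        raise ValueError("volumes must be positive")
    return c_initial_molar * v_initial_ul / (v_initial_ul + v_added_ul)


# 20 uL reaction at 2.5e-4 M tRNA, diluted with 30 uL water (50 uL total).
final_molar = diluted_concentration(2.5e-4, 20.0, 30.0)
print(f"{final_molar:.1e} M")
```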



Results

As the major limitation to expression of
the wild type BPV L1 and L2 genes appeared to be
translational in our system, we wished to test whether
this limitation reflected a limited availability of
the appropriate tRNA species for gene translation. As
transient expression of the synthetic genes within
intact cells may be regulated by many factors, we
tested our hypothesis in a cell free system using
rabbit reticulocyte lysate (RRL) or wheat germ lysate
to examine gene translation. Similar amounts of
plasmids expressing the wild type or synthetic
humanized BPV1 L1 gene were added to a T7 RNA
polymerase coupled RRL transcription/translation
system in the presence of ³⁵S-methionine. After 20
minutes, translated proteins were separated by SDS
PAGE and visualized by autoradiography. Efficient
translation of the modified L1 gene was observed
(Figure 4, top panel, lane 2), while translation of
the wild type BPV1 L1 sequence resulted in a weak 55
kDa L1 band (Figure 4, upper panel, lane 1). We
reasoned that although the wild type sequence was not
optimized for translation in RRL, some translation
would occur as there would be no cellular mRNA species
competing for the 'rare' codons present in the wild
type L1 sequence. The above data suggest that the
observed difference in efficiency of translation of
the wild type and synthetic humanized L1 genes is a
consequence of limited availability of the tRNAs
required for translation of the rare codons present in
the wild type gene. We therefore expected that
addition of excess tRNA to the in vitro translation
system would overcome the inhibition of translation of
the wild type L1 gene. To address this question, 10⁻⁵
M aminoacyl-tRNAs from yeast were added into the RRL
translation system, and L1 protein synthesis was
assessed. Introduction of exogenous tRNAs resulted in
a dramatic improvement in translation of the wild type
L1 sequence, which now gave a yield of L1 protein
comparable to that observed with the synthetic
humanized L1 sequence (SEQ ID NO:3) (Figure 4, top
panel). Enhancement of translation of the wild type
L1 gene (SEQ ID NO:1) by aminoacyl-tRNA was
dose-dependent, with an optimum efficiency at 10⁻⁵ M tRNA.
As addition of exogenous tRNA improved the yield of L1
protein translated from the wild type L1 gene sequence
(SEQ ID NO:1), we assessed the speed of translation of
wild type and humanized L1 mRNA. Samples were
collected from the translation mixture every 2
minutes, starting at the 8th minute. Translation of L1
(SEQ ID NO:2,4) from the wild type sequence (SEQ ID
NO:1) was much slower than from the humanized L1
sequence (SEQ ID NO:3) (Figure 4, bottom panel), and
the retardation of translation could be completely
overcome by adding exogenous, commercially
available yeast tRNA. Yeast tRNA was chosen in the
above analysis because the codon usage in yeast is
similar to that of papillomavirus (Table 1). Addition
of exogenous tRNA did not significantly improve the
translation of the humanized L1 gene (SEQ ID NO:3),
indicating that this sequence was optimized with
regard to codon usage for the rabbit reticulocyte
translation machinery (Figure 4, bottom panel). In
separate experiments we established that wt L1
translation could also be enhanced by liver tRNA
(Figure 4), and by tRNAs extracted from bovine skin
epidermis, which presumably constitutes a mixture of
tRNAs from differentiated and undifferentiated cells
(data not shown).
EXAMPLE 3

Translation of wild type L1 is efficient in wheat germ
extract.

To further test our hypothesis that tRNA
availability is a determinant of expression of the
wild type BPV1 L1 gene (SEQ ID NO:1), we examined the
translation of L1 in a cell type in which a quite
different set of tRNAs would be available. In a wheat
germ translation system, wild type L1 mRNA was
translated as efficiently as humanized L1 mRNA, and
addition of exogenous aminoacyl-tRNAs did not improve
the translation efficiency of either wild type or
humanized sequences (Figure 4, bottom panel). This
indicated that wheat germ contains sufficient amounts
of the tRNAs which are limiting for translation of the
wild type L1 sequence in RRL to allow efficient L1
translation.

EXAMPLE 4

Modified late genes can be expressed in
undifferentiated cells from papillomavirus promoter(s)

While our data presented above indicate
that translation is limiting for the production of
BPV1 capsid proteins in our test system, these
experiments were conducted in systems which are not
truly representative of the viral late gene
transcription from the BPV genome, in part because the
genes were driven by a strong CMV promoter. We
therefore wished to establish whether synthetic
humanized BPV capsid protein mRNA would be translated
more efficiently than the wild type mRNA, if
transcribed from the natural BPV1 promoter. This
would establish whether translation was indeed one of
the limiting factors for expression of BPV1 late genes
driven from the natural cryptic late gene promoter in
an undifferentiated cell. The BPV genome was cleaved
at nt 4450 and 6958 with BamHI/HindIII and the
original L1 (nt 4186-5595) and L2 (5068-7095) ORFs
were removed. The synthetic humanized L2 gene (SEQ ID
NO:7), together with an SV40 ori sequence to allow
plasmid replication in eukaryotic cells, was inserted
into the BPV genome lacking L1/L2 ORF sequences. This
plasmid (Figure 5A) was designated pCICR1. A similar
plasmid was constructed with wild type (SEQ ID NO:5)
rather than synthetic humanized L2 and designated
pCICR2. Cos-1 cells were transfected with these
plasmids and L2 protein expression examined by
immunofluorescence of transfected cells. Synthetic
humanized L2 (SEQ ID NO:7), driven by the natural BPV-1
promoter, was efficiently expressed, whereas the
wild type L2 sequence (SEQ ID NO:5), driven from a
similar construct, produced no immunoreactive L2
protein (SEQ ID NO:6,8) (Figure 5B). As
undifferentiated cells supported the expression of the
humanized L2 gene (SEQ ID NO:7) but not the wild type
L2 (SEQ ID NO:5) expressed from the cryptic late BPV
promoter, the results confirmed our earlier
observations from experiments using the CMV promoter.
However, the plasmids tested here contained SV40 ori,
designed to replicate the DNA in Cos cells. The
increased copy number of the BPV1 L2 plasmids or the
transcriptional enhancing activity of the SV40 ori
might explain in part the increased efficiency of
expression of L2 in this experimental system when
compared with infected skin. However, the marked
difference in expression between the natural and
humanized genes seen with a CMV promoter construct is
still observed with the natural promoter.



EXAMPLE 5

Substitution of papillomavirus-preferred codons
prevents translation but not transcription of a
non-papillomavirus gene in undifferentiated cells.

Materials and Methods

Codon replacement in gfp gene

To construct a modified gfp gene (SEQ ID
NO:11) using papillomavirus preferred codons (PGFP), 6
pairs of oligonucleotides were synthesized. Each pair
of oligonucleotides has restriction sites incorporated
and was used to amplify gfp using a humanized gfp gene
(SEQ ID NO:9) (GIBCO) as template. The PCR fragments
were ligated into the pUC18 vector to produce pUCPGFP.
The PGFP gene was sequenced, and cloned into the BamHI
site of the same mammalian expression vector, pCDNA3,
under the CMV promoter. The DNA and deduced amino
acid sequences of the humanized GFP gene are shown in
Figure 1C. Mutations introduced into the wild type
gfp gene (SEQ ID NO:9) to produce the Pgfp gene (SEQ
ID NO:11) are indicated above the corresponding
nucleotides of the wild-type sequence.
Results

To further confirm that codon usage can
alter gene expression in mammalian cells, we made a
further variant on a synthetic gfp gene modified for
optimal expression in eukaryotic cells (Zolotukhin, et
al., 1996. J. Virol. 70:4646-4654). In our variant,
codons optimized for expression in eukaryotic cells
were substituted by those preferentially used in
papillomavirus late genes. Of 240 codons in the
humanized gfp gene (SEQ ID NO:9), which expresses high
levels of fluorescent protein in cultured cells, 156
were changed to the corresponding papillomavirus late
gene-preferred codons to produce a new gfp gene (SEQ
ID NO:11) designated Pgfp. Expression of Pgfp (SEQ ID
NO:11) in undifferentiated cells was compared with
that of humanized gfp (SEQ ID NO:9). Cos-1 cells
transfected with the humanized gfp (SEQ ID NO:9)
produced a bright fluorescent signal after 24 hrs,
while cells transfected with Pgfp (SEQ ID NO:11)
produced only a faint fluorescent signal (Figure 6A).
To confirm that this difference reflected differing
translational efficacy, gfp specific mRNA was tested
in both transfections and found not to be
significantly different (Figure 6B). Thus, codon
usage and corresponding tRNA availability apparently
determine the observed restriction of expression of
PV late genes, and modification of codon usage in
other genes similarly prevents their expression in
undifferentiated cells.
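
Codon usage of the kind compared in this example can be summarized as the fraction of each amino acid that is encoded by each of its synonymous codons. The following Python sketch is one minimal way to compute such a summary; the function name `codon_usage` and the short input sequence are hypothetical, and the output is not data from the specification.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codon_usage(seq):
    """Return {codon: fraction of its amino acid encoded by that codon}.

    Trailing bases that do not complete a codon are ignored.
    """
    usable = len(seq) - len(seq) % 3
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    per_amino_acid = Counter()
    for codon, n in counts.items():
        per_amino_acid[CODON_TABLE[codon]] += n
    return {codon: n / per_amino_acid[CODON_TABLE[codon]]
            for codon, n in counts.items()}


# Hypothetical sequence: Leu (TTA), Leu (TTA), Leu (CTG), Met (ATG).
usage = codon_usage("TTATTACTGATG")
for codon in sorted(usage):
    print(codon, CODON_TABLE[codon], round(usage[codon], 2))
```

Comparing such tables for two genes, or for a gene and a set of host genes, gives a simple view of how far their codon preferences diverge.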
EXAMPLE 6

PGFP with papillomavirus-preferred codons is
efficiently expressed in vivo in differentiated mouse
keratinocytes.

Materials and Methods

Delivery of plasmid DNA into mouse skin by
gene gun

Fifty microgram of DNA was coated onto 25
gold micro-carriers by calcium precipitation,
following the manufacturer's instructions (Bio-Rad).
C57/bl mouse skin was bombarded with gold particles
coated with DNA plasmid at a pressure of 600 psi.
Serial sections were taken from the skin and examined
for distribution of the particles, confirming that a
pressure of 600 psi could deliver particles throughout
the epidermis.

Results

Mice were shot with gold beads carrying
PGFP DNA plasmid and, 24 hrs later, skin samples were
cut from the site of DNA delivery and examined for
expression of GFP protein (SEQ ID NO:10,12).
Fluorescence was detected mostly in upper keratinocyte
layers, representing the differentiated epithelium,
and was not seen in undifferentiated basal cells. In
contrast, skin sections shot with the humanized GFP
plasmid showed fluorescence in cells randomly
distributed throughout the whole epidermis (Figure 7).
Although GFP-positive cells were rare in both PGFP-
(SEQ ID NO:11) and GFP-inoculated (SEQ ID NO:9) mouse
skin, fluorescence was observed only in differentiated
strata in the PGFP sample (SEQ ID NO:11), whereas
fluorescence was observed throughout the epidermis in
GFP-inoculated (SEQ ID NO:9) mouse skin. This result
confirmed that the use of papillomavirus-preferred
codons resulted in the protein being expressed in an
epithelial differentiation-dependent manner.

EXAMPLE 7 

Microinjection of yeast tRNA and wild type
L1 gene into cultured cells

To test if yeast tRNA could facilitate
expression of wild type BPV-1 L1 (SEQ ID NO:1) (as
yeast uses a similar set of codons to those observed
in papillomavirus for its own genes), 2 pL of mixtures
containing tRNA (2 mg/mL) (purified yeast tRNA
(Boehringer Mannheim) or bovine liver tRNA - control)
and BPV L1 DNA (2 µg/mL) can be injected into CV-1
cells (Lu and Campisi, 1992, Proc. Natl. Acad.
Sci. U.S.A. 89 3889-3893). The injected cells can
then be cultured for 48 hrs at 37 °C and examined for
expression of L1 gene by standard immunofluorescence
methods using BPV L1-specific antibody and quantified
by FACS analysis (Qi et al 1996, Virology 216 35-45).

EXAMPLE 8 

Establishment of a cell line which can
continuously produce HPV virus particles

To produce infectious PV, various methods
have been tried including the epithelial raft culture
system (Dollard et al 1992, Genes Dev 6 1131-1142),
and cell lines containing BPV-1 episomal DNA, and
infected by BPV-1 L1/L2 recombinant vaccinia (Zhou et
al 1993, J. Gen. Virol. 74 763-768) or transfected by
SFV RNA (Roden et al 1996, J. Virol. 70 5875-5883).
The yield of particles is in each case low. In a
reduction to practice of our discovery, synthetic BPV
L1 (SEQ ID NO:3) and L2 genes (SEQ ID NO:7) (as
described in Example 1) can be used to produce
infectious BPV in a cell line containing BPV-1
episomal DNA. Fibroblast cell lines (CON/BPV)
containing BPV-1 episomal DNA (Zhou et al 1993, J.
Gen. Virol. 74 763-768) can be used for transfection
of the synthetic BPV-1 L1 (SEQ ID NO:3) and L2 genes
(SEQ ID NO:7) under control of CMV promoter. BPV
particles may then be purified from the cell lysate
and the purified particles examined for the presence
of BPV-1 genome. Standard methods such as
transfection with lipofectamine (BRL) and G418
selection of transfected cells can be utilized to
generate suitable transfectants expressing humanized
L1 (SEQ ID NO:3) and L2 (SEQ ID NO:7) in the
background of BPV-1 episomal DNA. Examination of L1
and L2 protein expression can be performed using
rabbit anti-BPV L1 or rabbit anti-BPV L2 polyclonal
antibodies. BPV particles can then be purified using
our published methods (Zhou et al 1995, Virology 214
167-176) and can be characterized by electron
microscopy and DNA blotting. The infectivity of BPV
particles isolated from the cultured cells may be
tested in focus formation assays using C127
fibroblasts.
EXAMPLE 9

Method for extracting and measuring tRNA from tissues

Tissue (100 g) is homogenized in a Waring
Blender with 150 mL of phenol (Mallinckrodt,
Analytical Reagent, 88%) saturated with water (15:3)
and 150 mL of 1.0 M NaCl, 0.005 M EDTA in 0.1 M
Tris-chloride buffer, pH 7.5. The homogenate was spun
for ten minutes at top speed in the International
clinical centrifuge and the upper layer was carefully
decanted off. To this aqueous layer, three volumes of
95% ethanol were added. The resultant precipitate was
spun down at top speed in the International clinical
centrifuge and resuspended in 250 mL of 0.1 M
Tris-chloride buffer, pH 7.5. This solution was added
(flow rate of 15-20 drops per minute) to a column (2 x
10 cm) of 2 g of DEAE-cellulose previously
equilibrated with cold 0.1 M Tris-chloride buffer, pH
7.5. The column was then washed with 1 L of
Tris-chloride buffer, pH 7.5 and the RNA eluted with 1.0 M
NaCl in 0.1 M Tris-chloride buffer, pH 7.5. The first
10 mL of NaCl solution were discarded as "hold-up."
Sufficient salt solution (60-80 mL) was then collected
until the optical density of the effluent was less
than three at 260 nm. This solution was extracted
twice with an equal volume of phenol saturated with
water and twice with ether. To the aqueous solution
containing the RNA, three volumes of 95% ethanol were
added and the solution was allowed to stand overnight
in the cold. The precipitate was spun down and washed
first with 80% and then twice with 95% ethanol and
dried in a vacuum. Approximately 60 mg of soluble
RNA were obtained from a 100-g lot of rat liver.



Quantitating tRNAs

The following nylon membranes are used:
Biodyne A and B (Pall). For the preparation of dot
blots, the tRNA samples (from 1 pg to 5 ng) are
denatured at 60 °C for 15 min in 1-5 µL of 15%
formaldehyde, 10x SSC (SSC is NaCl 0.3 M, tri-sodium
citrate 0.03 M). The samples are spotted in 1 µL
aliquots onto the membranes that have been soaked for
15 min in deionized water and slightly dried between
two sheets of 3 MM Whatman paper prior to the
application of the samples. The tRNAs are fixed
covalently on the membranes by ultraviolet
irradiation (10 min using an ultraviolet lamp at 254
nm and 100 W strength at a distance of 20 cm) and the
membranes are baked for 2-3 h at 80 °C.

A 5' end labelled synthetic
deoxyribo-oligonucleotide complementary to the [...] sequence
of the tRNA is used as a probe for the hybridization
experiments. Labelling of the oligonucleotide is
performed by direct phosphorylation of the 5'-OH
ended probe.

For hybridization experiments, the
UV-irradiated membranes are first preincubated for 5 h at
50 °C in 50% deionized formamide, 5 x SSC, 1% SDS,
0.04% Ficoll, 0.04% polyvinylpyrrolidone and 250 µg/mL
of sonicated salmon sperm DNA using 5 mL of buffer for
100 cm² of membrane. Hybridization is finally
performed overnight at 50 °C in the above solution
(2.5 mL/100 cm²) where the labeled probe has been
added. After hybridization, the membranes are washed
twice in 2 x SSC, 0.1% SDS for 5 min at room
temperature, twice in 2 x SSC, 1% SDS for 30 min at 60
°C and finally in 0.1 x SSC, 0.1% SDS for 30 min at
room temperature. To detect the hybridized probes, the
membranes are exposed for 16 h to Fuji X-ray film
with an intensifying screen.

Sequence of tRNA probes

The sequences of the tRNA probes are as
follows. The tRNA species, in the order listed (codon
shown in parentheses where legible; two species labels
are illegible, so labels are not paired with individual
sequences), are: Ala, Arg, Asn(AAC), Asp, Cys, Glu,
Gln, Gly, Ile, Leu(CTA), Leu, Lys, Lys(AAG),
Met(elong), Phe, Pro, Pro, Ser, Tyr(TAC) and Val(GTA).

5'-TAAGGACTGTAAGACTT (SEQ ID NO:13)
5'-CGAGCCAGCCAGGAGTC (SEQ ID NO:14)
5'-CTAGATTGGCAGGAATT (SEQ ID NO:15)
5'-TAAGATATATAGATTAT (SEQ ID NO:16)
5'-AAGTCTTAGTAGAGATT (SEQ ID NO:17)
5'-TATTTCTACACAGCATT (SEQ ID NO:18)
5'-CTAGGACAATAGGAATT (SEQ ID NO:19)
5'-TACTCTCTTCTGGGTTT (SEQ ID NO:20)
5'-TGCCGTGACTCGGATTC (SEQ ID NO:21)
5'-TAGAAATAAGAGGGCTT (SEQ ID NO:22)
5'-TACTTTTATTTGGATTT (SEQ ID NO:23)
5'-TATTAGGGAGAGGATTT (SEQ ID NO:24)
5'-TCACTATGGAGATTTTA (SEQ ID NO:25)
5'-CGCCCAACGTGGGGCTC (SEQ ID NO:26)
5'-TAGTACGGGAAGGATTT (SEQ ID NO:27)
5'-TGTTTATGGGATACAAT (SEQ ID NO:28)
5'-TCAAGAAGAAGGAGCTA (SEQ ID NO:29)
5'-GGGCTCGTCCGGGATTT (SEQ ID NO:30)
5'-ATAAGAAAGGAAGATCG (SEQ ID NO:31)
5'-TGTCTTGAGAAGAGAAG (SEQ ID NO:32)
5'-TGGTAAAAAGAGGATTT (SEQ ID NO:33)
5'-TCAGAGTGTTCATTGGT (SEQ ID NO:34) Val(GTA)



EXAMPLE 10

Comparison of the relative abundance of tRNA species
in undifferentiated and differentiated keratinocytes

Materials and Methods

Isolation of epidermal cells 

2-day old mice were killed and their skins 
removed. The skins were digested with 0.25% trypsin 
PBS at 4 °C overnight. The epidermis was separated 
from the dermis using forceps and minced with scissors 
in 10% FCS DMEM medium. The cell suspension was first 
filtered through a 1 mm and then a 0.2 mm nylon net. 
The cell suspension was then pelleted and washed twice 
with PBS. 

Density gradient centrifugation

The keratinocytes were resuspended in 30%
Percoll and separated by centrifugation through a
discontinuous Percoll gradient (1.085, 1.075 and 1.050
g/mL) at 1200 x g at room temperature for 25 min. The
cells were then washed with PBS and used to extract
tRNA.



tRNA purification

The cells were lysed in 5 mL of lysis
buffer (0.2 M NaOH, 1% SDS) for 10 min at room
temperature. The lysate was neutralized with 5 mL of
3.0 M potassium acetate (pH 5.5). After
centrifugation, the supernatant was diluted with 3
volumes of 100 mM Tris (pH 7.5) and added to a DEAE
column equilibrated with 100 mM Tris (pH 7.5). An
equal volume of isopropanol was added to the aqueous
solution containing tRNA, and the solution was allowed
to stand overnight at 4 °C. The tRNA was spun down
and washed with 75% ethanol, then dissolved in
RNase-free water.



tRNA blotting

10 ng of each tRNA sample in 1 µL was
denatured at 60 °C for 15 min in 4 µL formaldehyde and
5 µL 20 x SSC. The samples were spotted in 1 µL
aliquots onto charged nylon membrane (Amersham), and
the tRNAs were fixed with UV and probed with
³²P-labelled oligonucleotides.

Comparison of the abundance of the tRNA
species in undifferentiated and differentiated
keratinocytes showed that the levels of some tRNA
populations changed dramatically. For example, the
levels of tRNAs specific for Ala[...], Leu(CTT) and
Leu[...] were increased in differentiated cells, while
tRNAs for Arg[...], Pro[...] and Asn(AAC) were more
abundant in undifferentiated keratinocytes (see Table 2).



GENERAL DISCUSSION

In the present specification, the inventors
have confirmed that one determinant of the efficiency
of translation of a gene in mammalian cells is its
codon composition. This observation has commonly been
made when genes from prokaryotic organisms have been
expressed in eukaryotic cells (Smith, D. W., 1996,
Biotechnol. Prog. 12:417-422). The present inventors
have also presented evidence that mRNA encoding the
capsid genes of papillomavirus is not effectively
translated in cultured eukaryotic cells, apparently
because tRNA availability is rate limiting for
translation, and that the block to PV late gene
translation in eukaryotic cells in culture can be
overcome by altering the codon usage of the late genes
to match the consensus for mammalian genes, or
alternatively by providing exogenous tRNAs.
Alterations to mRNA secondary structure or protein
binding (Sokolowski, et al., 1998, J. Virol. 72:1504-
1515) as a consequence of the changes to the primary
sequence of the PV capsid genes might contribute to
the observed differences in efficiency of translation
of the natural and modified PV capsid gene mRNAs in
cultured cells. However, the enhancement of
translation of the natural but not the modified mRNA
that was observed after addition of tRNA in a
mammalian in vitro translation system, which was not
observed in a plant translation system, strengthens
the argument that tRNA availability is rate limiting
for translation of the natural gene in mammalian
cells. A shortage of critical tRNAs could result in
slowed elongation of the nascent peptide or premature
termination of translation (Oba, et al., 1991,
Biochimie 73:1109-1112). Slowed elongation appears to
be the major consequence for the PV late gene.
Analysis of codon usage in the PV genome shows that PV
late genes use many codons that mammalian cells rarely
use. For example, PV frequently uses UUA for leucine,
CGU for arginine, ACA for threonine, and AUA for
isoleucine, whereas these codons are significantly
less often used in mammalian genes. In contrast,
papillomavirus late genes can be expressed efficiently
in yeast (Jansen, et al., 1995, Vaccine 13:1509-1514;
Sasagawa, et al., 1995, Virology 206:126-135) and the
codon compositions of yeast and papillomavirus genes
are similar (Table 1). An apparent exception is that
PV L1 genes can be efficiently expressed in insect
cells (Kirnbauer, et al., 1992, Proc. Natl. Acad. Sci.
USA 89:12180-12184) using recombinant baculovirus, or
in various undifferentiated mammalian cells using
recombinant vaccinia (Zhou, et al., 1991, Virology
185:251-257). As infection with vaccinia or
baculovirus down regulates cellular protein synthesis,
the efficient expression of the L1 capsid proteins
under these circumstances may occur because less
cellular mRNA is available in a virus infected cell to
compete with the L1 mRNA for the rarer tRNAs.
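
The contrast drawn above between PV and mammalian codon usage can be made concrete with a short Python sketch. The frequencies (per thousand codons) are copied from the Human and BPV L1/L2 columns of Table 1 for the four codons named in the text; the dictionary and function names are illustrative assumptions, not part of the specification.

```python
# Codon frequencies per thousand, taken from Table 1
# (Human column and BPV L1/L2 column) for the codons named in the text.
HUMAN = {"UUA": 5.3, "CGU": 4.7, "ACA": 14.4, "AUA": 5.8}
BPV_LATE = {"UUA": 14.5, "CGU": 10.4, "ACA": 37.3, "AUA": 22.7}


def usage_ratio(codon: str) -> float:
    """Ratio of BPV late gene usage to human usage for one codon."""
    return BPV_LATE[codon] / HUMAN[codon]


for codon in HUMAN:
    print(f"{codon}: BPV/human = {usage_ratio(codon):.1f}")
```

On these figures each of the four codons is used more than twice as often in the BPV late genes as in human genes, which is the mismatch the discussion refers to.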

Codon composition could be a more general
determinant of gene expression within different stages
of differentiation of the same tissue. Although the
genetic code is essentially universal, different
organisms show differences in codon composition of
their genes, while the codon composition of genes
tends to be relatively similar for all genes within
each organism, and matched to the population of
iso-tRNAs for that organism (Ikemura, T., 1981, J. Mol.
Biol. 146:1-21). However, populations of tRNAs in
differentiating and neoplastic cells are different
(Kanduc, D., 1997, Arch. Biochem. Biophys. 342:1-6;
Yang and Comb, 1968, J. Mol. Biol. 31:138-142; Yang
and Novelli, 1968, Biochem. Biophys. Res. Commun.
31:534-539) and the tRNA populations also vary in cells
growing under different growth conditions (Doi, et
al., 1968, J. Biol. Chem. 243:945-951). Accordingly,
the inventors believe that codon composition and tRNA
availability together provide a primitive mechanism
for spatial and/or temporal regulation of gene
expression. It is well recognized that the G+C
content of many dsDNA viruses, a crude marker for
viral gene codon composition, is markedly different
from the G+C content of the DNA of the cells they
infect (Strauss, et al., 1995, "Virus Evolution" in
Virology (eds. Fields, B. N., et al.), Lippincott-
Raven, Philadelphia, pp 153-171). Viruses may
therefore have evolved to take advantage of codon
composition to regulate their own program of gene
expression, perhaps to avoid expression of lethal
quantities of viral proteins in undifferentiated cells
where the virus utilizes the cellular machinery to
replicate its genome.
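
Since G+C content is referred to above as a crude marker for codon composition, a minimal Python sketch of how it can be computed for a nucleotide sequence is given here. The function name `gc_content` is an illustrative assumption; whitespace is skipped and any other unrecognized character is rejected rather than guessed at.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a DNA or RNA sequence."""
    bases = "".join(sequence.split()).upper()
    if not bases:
        raise ValueError("sequence is empty")
    unknown = set(bases) - set("ACGTU")
    if unknown:
        raise ValueError(f"unexpected characters: {sorted(unknown)}")
    gc = sum(1 for base in bases if base in "GC")
    return gc / len(bases)


print(gc_content("ATGC"))
```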

As the inventors' observations represent an
apparently novel mechanism of regulation of gene
translation within a single tissue, it is relevant to
consider how this relates to previously proposed
hypotheses for the restriction of expression of PV
late genes to differentiated epithelium. A number of
explanations have been proposed for the observation
that PV late genes are only effectively expressed in
differentiated epithelium. Reduced late gene
transcription may reflect dependence of transcription
from the late promoter on transcription factors
expressed only in differentiated epithelium, or may
alternatively be due to suppression of late promoter
transcription by viral (Stubenrauch, et al., 1996, J.
Virol. 70:119-126) or cellular gene products expressed
in undifferentiated cells. The "late" promoters of
HPV31b and of HPV5 (Haller, et al., 1995, Virology
214:245-255; Hummel, et al., 1992, J. Virol. 66:6070-
6080) are described as differentiation dependent,
although the search for relevant transcription control
factors in differentiated keratinocytes by
conventional footprinting and DNA binding studies has
to date been unrewarding. Our data show that capsid
proteins are not translated from PV L1 and L2 mRNAs in
cells transfected with CMV promoter-based expression
vectors (Fig. 2), suggesting that, in addition to any
transcriptional controls that may exist, there is
a post-transcriptional block to capsid protein
synthesis in undifferentiated cells. Sequences
resembling 5' splice donor sites exist within L1 or L2
mRNA or within flanking untranslated message which are
inhibitory to transcription of genes with which they
are associated (Kennedy, et al., 1991, J. Virol.
65:2093-2097; Furth, et al., 1994, Mol. Cell. Biol.
14:5278-5289). Other AU rich sequences in L1 or L2
mRNA promote mRNA degradation (Sokolowski, et al.,
1997, Oncogene 15:2303-2319). These mechanisms
inhibiting L1 and L2 expression in undifferentiated
cells have yet to be shown to be inactive in
differentiated epithelium, to explain the successful
translation of late genes in this tissue.

Because inhibitory RNA sequences within the
L1 coding sequence could have been rendered
non-functional by the systematic codon substitution
employed in the experiments described herein and the
untranslated inhibitory sequences were not included in
the inventors' test system, the respective roles of
inhibitory sequences and codon mismatch in suppression
of PV late gene expression in cultured mammalian cells
cannot be determined. However, regulatory sequences
promoting RNA degradation or inhibiting translation
are presumed to act through interaction with nuclear
or cytoplasmic proteins (Sokolowski, et al., 1998, J.
Virol. 72:1504-1515), and inefficient translation of
native sequence L1 mRNA was observed in a cell free
translation system from anucleate cells, demonstrating
that codon composition of the PV late genes must play
some role in regulation of PV late gene translation.

Further evidence supporting the hypothesis
that codon composition is an important determinant of
PV capsid gene expression was gathered from an
analysis of the 84 PV L1 sequences currently available
in GenBank. The codon composition of the L1 genes,
and particularly the frequency of usage of the rarer
codons, was essentially the same across all the
published sequences (data not shown) as would be
predicted by the similar G+C content of the
papillomavirus genomes. The PV L1 gene is relatively
conserved at the amino acid level, showing 60 - 80%
amino acid homology between PV genotypes, as might be
expected from the constraints on capsid protein
function. There are, however, no obvious constraining
influences on the codon composition of the PV late
genes beyond those of the inventors' hypothesis, as
the late gene region does not code for other genes,
either in other reading frames or on the complementary
DNA strand, and has no known cis acting regulatory
functions. If codon composition of the capsid genes
were not important for PV function, a considerable
heterogeneity of codon usage might therefore be
expected, given the evolutionary diversity of PVs
(Chan, et al., 1995, J. Virol. 69:3074-3083).
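
A codon composition analysis of the kind described above can be sketched in a few lines of Python. The sketch tallies codon usage per thousand codons for a single open reading frame, in the same units as Table 1; the function name `codon_usage_per_thousand` is an illustrative assumption, and the example input is the first sixteen codons of the humanized L1 open reading frame given in the sequence listing (SEQ ID NO: 3).

```python
from collections import Counter


def codon_usage_per_thousand(cds: str) -> dict:
    """Tally codon usage (per thousand codons) for one coding sequence.

    Codons are reported in RNA notation (U instead of T), as in Table 1.
    """
    seq = "".join(cds.split()).upper().replace("T", "U")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    counts = Counter(codons)
    return {codon: 1000.0 * n / len(codons) for codon, n in counts.items()}


# First 16 codons of the humanized BPV1 L1 open reading frame.
usage = codon_usage_per_thousand(
    "atg gcc ctg tgg cag cag ggc cag aag ctg tac ctg ccc cct acc ccc"
)
print(usage["CUG"])
```

Applied to each available L1 sequence in turn, a tally of this kind would allow the usage of the rarer codons to be compared across genotypes.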

Taken together, the data and evidence
outlined herein make a strong case that codon usage
is a significant determinant of expression of PV late
genes in undifferentiated and differentiated
epithelial cells, and that this observation is
generalizable. Establishing the relative roles of
message instability and codon mismatch in determining
expression in differentiated tissues will require
comparisons of transcriptional activity and
translation of the L1 or L2 genes driven from strong
constitutive promoters in differentiated and
undifferentiated epithelium. Such work should now be
feasible using either transgenic technology or
keratinocyte raft cultures.

Although mechanisms of transcriptional
regulation of PV L1 or L2 gene expression in the
superficial layer of differentiated epithelium have
been proposed (Zeltner et al., 1994, J. Virol.
68:3620; Brown, et al., 1995, Virology 214:259; Stoler
et al., 1992, Hum. Pathol. 23:117; Hummel et al.,
1995, J. Virol. 69:3381; Haller et al., 1995, Virology
214:245; Barksdale and Baker, 1993, J. Virol.
67:5605), measurable PV late gene mRNA is not always
associated with production of late proteins (Zeltner
et al., 1994, supra; Ozbun and Meyers, 1997, J. Virol.
71:5161), and the data presented here suggest that
translation regulation may play a major part in
controlling PV late gene expression. This observation
has implications as herein described for the
regulation of expression of genes related to the
specialised functions of any differentiated tissue,
and also for targeting of expression of therapeutic
genes to such tissue while avoiding the potentially
deleterious consequences of expression of the
exogenous gene in a self renewing stem cell
population.
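
The targeting approach referred to above rests on replacing codons with synonymous codons while leaving the encoded protein unchanged. A minimal Python sketch of such a substitution follows, assuming the standard genetic code. The names `translate` and `recode` and the preference map are illustrative and are not taken from the specification; the example preferences (GCA for Ala, CTT for Leu) follow the codons named in claim 5 for differentiated cells.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(cds: str) -> str:
    """Strip whitespace, upper-case, and convert RNA notation to DNA."""
    seq = "".join(cds.split()).upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return seq


def translate(cds: str) -> str:
    """Translate a coding sequence into one-letter amino acid codes."""
    seq = clean(cds)
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))


def recode(cds: str, preferred: dict) -> str:
    """Replace codons with the preferred synonymous codon per amino acid.

    `preferred` maps a one-letter amino acid code to the codon to use.
    Amino acids without an entry keep their existing codon.
    """
    seq = clean(cds)
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        aa = CODON_TO_AA[codon]
        new_codon = preferred.get(aa, codon)
        if CODON_TO_AA[new_codon] != aa:
            raise ValueError(f"{new_codon} is not synonymous with {codon}")
        out.append(new_codon)
    return "".join(out)


parent = "atg gcc ctg tgg"  # Met Ala Leu Trp
synthetic = recode(parent, {"A": "GCA", "L": "CTT"})
print(synthetic, translate(synthetic))
```

The check inside `recode` guards against a preference map that would change the protein sequence, which is the one property a synonymous substitution must preserve.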

The present invention has been described in
terms of particular embodiments found or proposed by
the present inventors to comprise preferred modes for
the practice of the invention. Those of skill in the
art will appreciate that, in light of the present
disclosure, numerous modifications and changes may be
made in the particular embodiments exemplified without
departing from the scope of the invention. All such
modifications are intended to be included within the
scope of the appended claims.

TABLE 1

The codon usage data for human, cow, yeast
and wheat proteins are derived from published
results (18). The BPV1 data are from the sequences in
the GenBank database.

TABLE 2

Each iso-acceptor tRNA, with its anticodon shown
as superscript, is shown on the top row. The "+"
symbol indicates the abundance of tRNA, wherein each
"+" indicates about a 10 fold increase.

TABLE 1

Frequency (per one thousand) of codon usage for individual organisms

Amino
acids  Codons  Human   Cow          Yeast        Wheat        BPV L1/L2

ARG    CGA     5.4     [illegible]  [illegible]  2.3          7.2
       CGC     11.3    [illegible]  [illegible]  7.5          4.1
       CGG     10.4    [illegible]  [illegible]  4.6          5.1
       CGU     4.7     3.7          [illegible]  [illegible]  10.4
       AGA     9.9     [illegible]  [illegible]  4.1          14.4
       AGG     11.1    11.4         [illegible]  7.1          9.3
LEU    CUA     6.2     4.9          [illegible]  12.1         18.6
       CUC     19.9    21.2         [illegible]  18.6         6.2
       CUG     42.5    46.6         [illegible]  15.5         15.5
       CUU     10.7    10.6         [illegible]  [illegible]  20.7
       UUA     5.3     4.0          [illegible]  [illegible]  14.5
       UUG     11.0    9.6          32.1         15.3         [illegible]
SER    UCA     9.3     7.6          15.6         14.6         16.6
       UCC     17.7    17.6         14.4         10.1         11.4
       UCG     4.2     4.5          6.5          9.6          6.2
       UCU     13.2    11.2         24.6         14.8         15.5
       AGC     18.7    18.7         7.1          12.8         12.4
       AGU     9.4     8.6          11.7         12.9         21.7
THR    ACA     14.4    11.4         15.6         4.6          37.3
       ACC     23.0    21.1         13.9         15.9         19.7
       ACG     6.7     7.8          6.7          4.5          4.1
       ACU     12.7    9.6          22.0         11.8         28.0
PRO    CCA     14.6    12.0         21.4         71.2         22.8
       CCC     20.0    19.2         5.9          11.1         15.5
       CCG     6.5     7.9          4.1          19.4         0.0
       CCU     15.5    14.6         12.8         10.3         33.1
ALA    GCA     14.0    13.1         15.3         11.2         33.1
       GCC     29.1    35.8         15.5         19.5         17.6
       GCG     7.2     9.3          5.1          13.8         4.1
       GCU     19.6    19.1         28.3         9.6          13.5
GLY    GGA     17.1    16.2         8.9          25.9         22.8
       GGC     25.4    28.1         8.9          28.0         12.4
       GGG     17.3    19.2         5.1          28.5         22.8
       GGU     11.2    11.8         34.9         9.6          18.6
VAL    GUA     5.9     5.1          10.0         4.4          15.5
       GUC     16.3    18.4         14.9         14.8         6.2
       GUG     30.9    32.9         9.5          12.9         23.8
       GUU     10.4    9.9          26.6         11.6         16.6
LYS    AAA     22.2    21.6         37.7         4.5          37.2
       AAG     34.9    37.1         35.2         17.4         13.5
ASN    AAC     22.6    22.4         25.8         14.2         10.3
       AAU     16.6    12.5         31.4         6.7          24.8
GLN    CAA     11.1    9.7          29.8         171.8        22.8
       CAG     33.6    34.4         10.4         79.4         17.6
HIS    CAC     14.2    14.0         8.2          8.2          6.2
       CAU     9.3     7.5          12.3         7.1          13.4
GLU    GAA     26.8    24.4         48.9         7.8          36.2
       GAG     41.4    45.4         16.9         19.7         21.7
ASP    GAC     29.0    31.5         22.3         13.0         18.6
       GAU     21.7    19.2         37.0         4.0          33.1
TYR    UAC     18.8    20.3         16.5         24.5         17.6
       UAU     12.5    10.5         16.5         12.5         18.6
CYS    UGC     14.5    13.9         3.7          14.8         5.2
       UGU     9.9     9.4          7.6          4.9          5.2
PHE    UUC     22.6    25.5         20.0         14.1         7.2
       UUU     15.8    17.0         23.2         15.0         23.8
ILE    AUA     5.8     5.2          12.8         5.4          22.7
       AUC     24.3    25.8         18.4         19.7         8.2
       AUU     14.9    13.1         31.1         10.7         20.7

TABLE 2



tRNA population changes as KC starts to differentiate

tRNA    Arg^CGA  Ala^GCA  His^CA?  Leu^CTT  Leu^CTA  Lys^AA?  Lys^AA?  Met^Ini  Pro^CC?
Supra            +++      +        +++      +++      ++       +        +
Basal   +++      +        ++       +        +        +        +        ++       +++

tRNA    Val^GT?  Val^GT?  His^CA?  Asn^AA?  Thr^ACA  Met^Elo  Gly^GG?
Supra   ++       +        ++       +                 +        +
Basal   +        +        +        +++      +        ++       +

WHAT IS CLAIMED IS:



1. A synthetic nucleic acid sequence capable
of selectively expressing a protein in a target cell
or tissue of a mammal, wherein said selective
expression is effected by replacing at least one
existing codon of a parent nucleic acid sequence with
a synonymous codon to form said synthetic nucleic acid
sequence.

2. The nucleic acid sequence of claim 1,
wherein said synonymous codon corresponds to an
iso-tRNA which, when compared to an iso-tRNA corresponding
to the at least one existing codon, is in higher
abundance in the target cell or tissue relative to one
or more other cells or tissues of the mammal.

3. The nucleic acid sequence of claim 1,
wherein said synonymous codon corresponds to an
iso-tRNA which, when compared to an iso-tRNA corresponding
to the at least one existing codon, is in higher
abundance in the target cell or tissue relative to a
precursor cell or tissue.

4. The nucleic acid sequence of claim 1,
wherein said synonymous codon corresponds to an
iso-tRNA which, when compared to an iso-tRNA corresponding
to the at least one existing codon, is in higher
abundance in the target cell or tissue relative to a
cell or tissue derived therefrom.

5. The nucleic acid sequence of claim 1,
wherein said synonymous codons for selective
expression of said protein are selected from the group
consisting of gca (Ala), cuu (Leu) and cua (Leu), and
said target is a differentiated cell.

6. The nucleic acid sequence of claim 5,
wherein said differentiated cell is a differentiated
keratinocyte.

7. The nucleic acid sequence of any one of
claims 2 to 4, wherein said corresponding iso-tRNA in
said target cell or tissue is at a level which is at
least 110%, preferably at least 200%, more preferably
at least 500%, and most preferably at least 1000%, of
that expressed in the or each other cell or tissue of
the mammal.

8. The nucleic acid sequence of claim 1,
wherein the synonymous codon may be selected from the
group consisting of (1) a codon used at relatively
high frequency by genes, preferably highly expressed
genes, of the target cell or tissue, (2) a codon used
at relatively high frequency by genes, preferably
highly expressed genes, of the or each other cell or
tissue, (3) a codon used at relatively high frequency
by genes, preferably highly expressed genes, of the
mammal, (4) a codon used at relatively low frequency
by genes of the target cell or tissue, (5) a codon
used at relatively low frequency by genes of the or
each other cell or tissue, (6) a codon used at
relatively low frequency by genes of the mammal, (7) a
codon used at relatively high frequency by genes of
another organism, and (8) a codon used at relatively
low frequency by genes of another organism.

9. The nucleic acid sequence of claim 1,
wherein the at least one existing codon and the
synonymous codon are selected such that said protein
is expressed from said synthetic nucleic acid sequence
in said target cell or tissue at a level which is at
least 110%, preferably at least 200%, more preferably
at least 500%, and most preferably at least 1000%, of
that expressed from said parent nucleic acid sequence
in said target cell or tissue.

10. A method for selectively expressing a
protein in a target cell or tissue of a mammal,
wherein said selective expression is effected by
replacing at least one existing codon of a parent
nucleic acid sequence with a synonymous codon to form
said synthetic nucleic acid sequence.

11. The method of claim 10, wherein said method
is further characterized by the steps of:

(a) replacing at least one existing codon
of a parent nucleic acid sequence encoding said
protein with a synonymous codon to produce a synthetic
nucleic acid sequence having altered translational
kinetics compared to said parent nucleic acid sequence
such that said protein is selectively expressible in
said target cell or tissue;

(b) administering to the mammal and
introducing into said target cell or tissue, or a
precursor cell or precursor tissue thereof, said
synthetic nucleic acid sequence operably linked to one
or more regulatory nucleotide sequences; and

(c) selectively expressing said protein
in said target cell or tissue.

12. The method of claim 11 further including,
prior to step (a):

(i) measuring relative abundance of
different iso-tRNAs in said target cell or tissue, and
in one or more other cells or tissues of the mammal;
and

(ii) identifying said at least one
existing codon and said synonymous codon based on said
measurement, wherein said synonymous codon corresponds
to an iso-tRNA which, when compared to an iso-tRNA
corresponding to the existing codon, is in higher
abundance in said target cell or tissue relative to
the or each other cell or tissue of the mammal.

13. The method of claim 12, wherein step (ii)
is further characterized in that said synonymous codon
corresponds to an iso-tRNA which, when compared to an
iso-tRNA corresponding to the at least one existing
codon, is in higher abundance in the target cell or
tissue relative to a precursor cell or tissue.

14. The method of claim 12, wherein step (ii)
is further characterized in that said synonymous codon
corresponds to an iso-tRNA which, when compared to an
iso-tRNA corresponding to the at least one existing
codon, is in higher abundance in the target cell or
tissue relative to a cell or tissue derived therefrom.

15. The method of claim 11 further including,
prior to step (a), identifying said at least one
existing codon and said synonymous codon based on
respective relative frequencies of particular codons
used by genes selected from the group consisting of
(I) genes of the target cell or tissue, (II) genes of
the or each other cell or tissue, (III) genes of the
mammal, and (IV) genes of another organism.
16. A method for expressing a protein in a
target cell or tissue from a first nucleic acid
sequence including the steps of:

introducing into said target cell or
tissue, or a precursor cell or precursor tissue
thereof, a second nucleic acid sequence encoding at
least one isoaccepting transfer RNA wherein said
second nucleic acid sequence is operably linked to one
or more regulatory nucleotide sequences, and wherein
said at least one isoaccepting transfer RNA is
normally in relatively low abundance in said target
cell or tissue and corresponds to a codon of said
first nucleic acid sequence.

17. A method for producing a virus particle in
a cycling eukaryotic cell, said virus particle
comprising at least one protein necessary for assembly
of said virus particle, wherein said at least one
protein is not expressed in said cell from a parent
nucleic acid sequence at a level sufficient to permit
virus assembly therein, said method including the
steps of:

(a) replacing at least one existing codon
of said parent nucleic acid sequence with a synonymous
codon to produce a synthetic nucleic acid sequence
having altered translational kinetics compared to said
parent nucleic acid sequence such that said at least
one protein is expressible from said synthetic nucleic
acid sequence in said cell at a level sufficient to
permit virus assembly therein;

(b) introducing into said cell or a
precursor thereof said synthetic nucleic acid sequence
operably linked to one or more regulatory nucleotide
sequences; and

(c) expressing said at least one protein
in said cell in the presence of other viral proteins
required for assembly of said virus particle to
thereby produce said virus particle.

18. A method for producing a virus particle in
a cycling cell, said virus particle comprising at
least one protein necessary for assembly of said virus
particle, wherein said at least one protein is not
expressed in said cell from a parent nucleic acid
sequence at a level sufficient to permit virus
assembly therein, and wherein at least one existing
codon of said parent nucleic acid sequence is rate
limiting for the production of said at least one protein
to said level, said method including the step of
introducing into said cell a nucleic acid sequence
capable of expressing therein an isoaccepting transfer
RNA specific for said at least one codon.

19. A vector comprising a nucleic acid sequence
according to any of claims 1 to 9 wherein said
synthetic nucleic acid sequence is operably linked to
one or more regulatory nucleic acid sequences.

20. A pharmaceutical composition comprising a
nucleic acid sequence according to any of claims 1 to
9 together with a pharmaceutically acceptable carrier.

21. A pharmaceutical composition comprising a
vector according to claim 19 together with a
pharmaceutically acceptable carrier.

22. A cell or tissue comprising therein a
nucleic acid sequence according to any of claims 1 to
9.

23. A cell or tissue comprising therein a
vector according to claim 19.

24. A cell or tissue resulting from a method
according to any one of claims 10 to 18.

25. Virus particles produced from a method
according to claim 17 or 18.
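
For illustration only, the percentage thresholds recited in claims 7 and 9 (at least 110%, 200%, 500% and 1000%) can be expressed as a small Python helper. The function name `threshold_tier` and the tier labels are hypothetical conveniences introduced here; they are not claim language and carry no legal meaning.

```python
# Thresholds recited in claims 7 and 9, highest first.
# The labels are hypothetical shorthand, not claim language.
TIERS = [
    (1000.0, "most preferred"),
    (500.0, "more preferred"),
    (200.0, "preferred"),
    (110.0, "minimum"),
]


def threshold_tier(target_level: float, reference_level: float):
    """Return the highest tier met by target_level relative to reference_level.

    Returns None when the target level is below 110% of the reference.
    """
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    percent = 100.0 * target_level / reference_level
    for threshold, label in TIERS:
        if percent >= threshold:
            return label
    return None


print(threshold_tier(2.5, 1.0))
```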
Val Ser Lys Val Leu Cys Ser Glu Thr Tyr Val Gln Arg Lys Ser Ile
            20                  25

ttt tat cat gca gaa acg gag cgc ctg cta act ata gga cat cca tat
225                 230                 235

agc atg ttc ttt ttt gca agg aaa gaa cag gtg tat gtt aga cac atc   768
Ser Met Phe Phe Phe Ala Arg Lys Glu Gln Val Tyr Val Arg His Ile
                245                 250                 255

tgg acc aga ggg ggc tcg gag aaa gaa gcc cct acc aca gat ttt tat   816
Trp Thr Arg Gly Gly Ser Glu Lys Glu Ala Pro Thr Thr Asp Phe Tyr
            260                 265                 270

tta aag aat aat aaa ggg gat gcc acc ctt aaa ata ccc agt gtg cat   864
Leu Lys Asn Asn Lys Gly Asp Ala Thr Leu Lys Ile Pro Ser Val His
        275                 280                 285

ttt ggt agt ccc agt ggc tca cta gtc tca act gat aat caa att ttt   912
Phe Gly Ser Pro Ser Gly Ser Leu Val Ser Thr Asp Asn Gln Ile Phe
    290                 295                 300

aac cgg ccc tac tgg cta ttc cgt gcc cag ggc atg aac aat gga att   960
Asn Arg Pro Tyr Trp Leu Phe Arg Ala Gln Gly Met Asn Asn Gly Ile
305                 310                 315                 320

gca tgg aat aat tta ttg ttt tta aca gtg ggg gac aat aca cgt ggt   1008
Ala Trp Asn Asn Leu Leu Phe Leu Thr Val Gly Asp Asn Thr Arg Gly
                325                 330                 335

act aat ctt acc ata agt gta gcc tca gat gga acc cca cta aca gag   1056
Thr Asn Leu Thr Ile Ser Val Ala Ser Asp Gly Thr Pro Leu Thr Glu
            340                 345                 350

tat aat agc tca aaa ttc aat gta tac cat aga cat atg gaa gaa tat   1104
Tyr Asp Ser Ser Lys Phe Asn Val Tyr His Arg His Met Glu Glu Tyr
        355                 360                 365

aag cta gcc ttt ata tta gag cta tgc tct gtg gaa atc aca gct caa   1152
Lys Leu Ala Phe Ile Leu Glu Leu Cys Ser Val Glu Ile Thr Ala Gln
    370                 375                 380

act gtg tca cat ctg caa gga ctt atg ccc tct gtg ctt gaa aat tgg   1200
Thr Val Ser His Leu Gln Gly Leu Met Pro Ser Val Leu Glu Asn Trp
385                 390                 395                 400

aaa ata agt gtg cag cct cct acc tca tcg ata tta gag gac acc tat   1248
Glu Ile Gly Val Gln Pro Pro Thr Ser Ser Ile Leu Glu Asp Thr Tyr
                405                 410                 415

cac tat ata gag tct cct gca act aaa tgt gca agc aat gta att cct   1296
Arg Tyr Ile Glu Ser Pro Ala Thr Lys Cys Ala Ser Asn Val Ile Pro
            420                 425                 430

gca aaa gaa gac cct tat gca ggg ttt aag ttt tgg aac ata gat ctt   1344
Ala Lys Glu Asp Pro Tyr Ala Gly Phe Lys Phe Trp Asn Ile Asp Leu
        435                 440                 445

aaa aaa aag ctt tct ttg gac tta gat caa ttt ccc ttg gga aga aga   1392
Lys Glu Lys Leu Ser Leu Asp Leu Asp Gln Phe Pro Leu Gly Arg Arg
    450                 455                 460

ttt tta aca cag caa ggg gca gga tgt tca act gtg aga aaa cga aga   1440
Phe Leu Ala Gln Gln Gly Ala Gly Cys Ser Thr Val Arg Lys Arg Arg
465                 470                 475                 480

att agc caa aaa act tcc agt aag cct gca aaa aaa aaa aaa aaa taa   1488
Ile Ser Gln Lys Thr Ser Ser Lys Pro Ala Lys Lys Lys Lys Lys
                485                 490                 495


<210> 2 
<211> 495 
<212> PRT 

<213> Bovine papillomavirus type 1 

Met Ala Leu Trp Gln Gln Gly Gln Lys Leu Tyr Leu Pro Pro Thr Pro
1               5                   10                  15

Val Ser Lys Val Leu Cys Ser Glu Thr Tyr Val Gln Arg Lys Ser Ile
            20                  25                  30

Phe Tyr His Ala Glu Thr Glu Arg Leu Leu Thr Ile Gly His Pro Tyr
        35                  40                  45

Tyr Pro Val Ser Ile Gly Ala Lys Thr Val Pro Lys Val Ser Ala Asn
    50                  55                  60

Gln Tyr Arg Val Phe Lys Ile Gln Leu Pro Asp Pro Asn Gln Phe Ala
65                  70                  75                  80

Leu Pro Asp Arg Thr Val His Asn Pro Ser Lys Glu Arg Leu Val Trp
                85                  90                  95

Pro Val Ile Gly Val Gln Val Ser Arg Gly Gln Pro Leu Gly Gly Thr
            100                 105                 110

Val Thr Gly His Pro Thr Phe Asn Ala Leu Leu Asp Ala Glu Asn Val
        115                 120                 125

Asn Arg Lys Val Thr Thr Gln Thr Thr Asp Asp Arg Lys Gln Thr Gly
    130                 135                 140

Leu Asp Ala Lys Gln Gln Gln Ile Leu Leu Leu Gly Cys Thr Pro Ala
145                 150                 155                 160

Glu Gly Glu Tyr Trp Thr Thr Ala Arg Pro Cys Val Thr Asp Arg Leu
                165                 170                 175

Glu Asn Gly Ala Cys Pro Pro Leu Glu Leu Lys Asn Lys His Ile Glu
            180                 185                 190

Asp Gly Asp Met Met Glu Ile Gly Phe Gly Ala Ala Asn Phe Lys Glu
        195                 200                 205

Ile Asn Ala Ser Lys Ser Asp Leu Pro Leu Asp Ile Gln Asn Glu Ile
    210                 215                 220

Cys Leu Tyr Pro Asp Tyr Leu Lys Met Ala Glu Asp Ala Ala Gly Asn
225                 230                 235                 240

Ser Met Phe Phe Phe Ala Arg Lys Glu Gln Val Tyr Val Arg His Ile
                245                 250                 255

Trp Thr Arg Gly Gly Ser Glu Lys Glu Ala Pro Thr Thr Asp Phe Tyr
            260                 265                 270

Leu Lys Asn Asn Lys Gly Asp Ala Thr Leu Lys Ile Pro Ser Val His
        275                 280                 285

Phe Gly Ser Pro Ser Gly Ser Leu Val Ser Thr Asp Asn Gln Ile Phe
    290                 295                 300

Asn Arg Pro Tyr Trp Leu Phe Arg Ala Gln Gly Met Asn Asn Gly Ile
305                 310                 315                 320

Ala Trp Asn Asn Leu Leu Phe Leu Thr Val Gly Asp Asn Thr Arg Gly
                325                 330                 335

Thr Asn Leu Thr Ile Ser Val Ala Ser Asp Gly Thr Pro Leu Thr Glu
            340                 345                 350

Tyr Asp Ser Ser Lys Phe Asn Val Tyr His Arg His Met Glu Glu Tyr
        355                 360                 365

Lys Leu Ala Phe Ile Leu Glu Leu Cys Ser Val Glu Ile Thr Ala Gln
    370                 375                 380

Thr Val Ser His Leu Gln Gly Leu Met Pro Ser Val Leu Glu Asn Trp
385                 390                 395                 400

Glu Ile Gly Val Gln Pro Pro Thr Ser Ser Ile Leu Glu Asp Thr Tyr
                405                 410                 415

Arg Tyr Ile Glu Ser Pro Ala Thr Lys Cys Ala Ser Asn Val Ile Pro
            420                 425                 430

Ala Lys Glu Asp Pro Tyr Ala Gly Phe Lys Phe Trp Asn Ile Asp Leu
        435                 440                 445

Lys Glu Lys Leu Ser Leu Asp Leu Asp Gln Phe Pro Leu Gly Arg Arg
    450                 455                 460

Phe Leu Ala Gln Gln Gly Ala Gly Cys Ser Thr Val Arg Lys Arg Arg
465                 470                 475                 480

Ile Ser Gln Lys Thr Ser Ser Lys Pro Ala Lys Lys Lys Lys Lys
                485                 490                 495


<210> 3
<211> 1488
<212> DNA
<213> Artificial Sequence

<220>
<221> CDS
<222> (1)..(1488)

<220>
<223> Description of Artificial Sequence: Bovine
papillomavirus type 1 L1 open reading frame
(humanized)

<220>
<223> Wild-type codons replaced with synonymous codons
used at relatively high frequency by human genes

<400> 3 

atg gcc ctg tgg cag cag ggc cag aag ctg tac ctg ccc cct acc ccc 48
Met Ala Leu Trp Gln Gln Gly Gln Lys Leu Tyr Leu Pro Pro Thr Pro
1 5 10 15

gtg agc aag gtg ctt tgc agt gaa acc tat gtg caa aga aaa agc att 96
Val Ser Lys Val Leu Cys Ser Glu Thr Tyr Val Gln Arg Lys Ser Ile
20 25 30

ttt tat cat gca gaa acg gag cgc ctg ctg acc atc gga cac ccc tat 144
Phe Tyr His Ala Glu Thr Glu Arg Leu Leu Thr Ile Gly His Pro Tyr
35 40 45

tac ccc gtg tcc atc ggg gcc aag act gtg cct aag gtg tcc gcc aat 192
Tyr Pro Val Ser Ile Gly Ala Lys Thr Val Pro Lys Val Ser Ala Asn
50 55 60

cag tat agg gtg ttc aaa atc caa ctg cct gat ccc aat caa ttt gca 240
Gln Tyr Arg Val Phe Lys Ile Gln Leu Pro Asp Pro Asn Gln Phe Ala
65 70 75 80

ctg cct gac agg acc gtg cac aac ccc agc aaa gag cgg ctg gtg tgg 288
Leu Pro Asp Arg Thr Val His Asn Pro Ser Lys Glu Arg Leu Val Trp
85 90 95

cca gtg atc ggc gtg cag gtg tcc aga ggc cag cct ctg ggc ggc acc 336
Pro Val Ile Gly Val Gln Val Ser Arg Gly Gln Pro Leu Gly Gly Thr
100 105 110

gtg act ggg cac ccc act ttt aat gct ttg ctt gat gca gaa aat gtg 384
Val Thr Gly His Pro Thr Phe Asn Ala Leu Leu Asp Ala Glu Asn Val
115 120 125

aat aga aaa gtc acc acc cag acc acc gac gac agg aaa cag aca ggc 432
Asn Arg Lys Val Thr Thr Gln Thr Thr Asp Asp Arg Lys Gln Thr Gly
130 135 140

ctg gat gcc aag cag cag cag atc ctg ctg ctg ggc tgt acc cct gct 480
Leu Asp Ala Lys Gln Gln Gln Ile Leu Leu Leu Gly Cys Thr Pro Ala
145 150 155 160

gaa ggg gaa tat tgg aca aca gcc cgt cca tgt gtg acc gac cgt cta 528
Glu Gly Glu Tyr Trp Thr Thr Ala Arg Pro Cys Val Thr Asp Arg Leu
165 170 175

gaa aac ggc gcc tgc cct cct ctg gag ctg aaa aac aag cac atc gaa 576
Glu Asn Gly Ala Cys Pro Pro Leu Glu Leu Lys Asn Lys His Ile Glu
180 185 190

gat ggg gat atg atg gaa att ggg ttt ggt gca gcc aac ttc aaa gaa 624
Asp Gly Asp Met Met Glu Ile Gly Phe Gly Ala Ala Asn Phe Lys Glu
195 200 205

att aat gca agt aaa tca gat cta cct ctg gac atc caa aat gag atc 672
Ile Asn Ala Ser Lys Ser Asp Leu Pro Leu Asp Ile Gln Asn Glu Ile
210 215 220

tgc ctg tac ccc gac tac ctg aaa atg gct gag gac gcc gcc ggc aac 720
Cys Leu Tyr Pro Asp Tyr Leu Lys Met Ala Glu Asp Ala Ala Gly Asn
225 230 235 240

agc atg ttc ttc ttc gcc agg aag gag cag gtg tac gtg aga cac atc 768
Ser Met Phe Phe Phe Ala Arg Lys Glu Gln Val Tyr Val Arg His Ile
245 250 255

tgg acc aga ggc ggc tcc gag aaa gaa gcc cct acc aca gat ttt tat 816
Trp Thr Arg Gly Gly Ser Glu Lys Glu Ala Pro Thr Thr Asp Phe Tyr
260 265 270

ttg aag aac aac aag ggc gac gcc acc ctg aag atc ccc agc gtg cac 864
Leu Lys Asn Asn Lys Gly Asp Ala Thr Leu Lys Ile Pro Ser Val His
275 280 285

ttc ggc agc ccc agc ggc tca cta gtg tcc acc gac aac cag atc ttc 912
Phe Gly Ser Pro Ser Gly Ser Leu Val Ser Thr Asp Asn Gln Ile Phe
290 295 300

aac cgg ccc tac tgg ctg ttc cgc gcc cag ggc atg aac aat gga att 960
Asn Arg Pro Tyr Trp Leu Phe Arg Ala Gln Gly Met Asn Asn Gly Ile
305 310 315 320

gcc tgg aac aac ctg ctg ttc ctg acc gtg ggc gac aac aca cgt ggc 1008
Ala Trp Asn Asn Leu Leu Phe Leu Thr Val Gly Asp Asn Thr Arg Gly
325 330 335

acc aac ctg acc atc agc gtg gcc tcc gat gga acc cca ctg acc gag 1056
Thr Asn Leu Thr Ile Ser Val Ala Ser Asp Gly Thr Pro Leu Thr Glu
340 345 350

tat gat agc tcg aaa ttc aac gtg tac cac aga cac atg gag gag tat 1104
Tyr Asp Ser Ser Lys Phe Asn Val Tyr His Arg His Met Glu Glu Tyr
355 360 365

aag cta gcc ttc atc ctg gag ctg tgc tcc gtg gag atc acc gcc cag 1152
Lys Leu Ala Phe Ile Leu Glu Leu Cys Ser Val Glu Ile Thr Ala Gln
370 375 380

acc gtg tcc cat ctg caa gga ctg atg ccc tcc gtg ctg gag aat tgg 1200
Thr Val Ser His Leu Gln Gly Leu Met Pro Ser Val Leu Glu Asn Trp
385 390 395 400

gag atc ggc gtg cag ccc ccc acc tca tcg atc ttg gag gac acc tac 1248
Glu Ile Gly Val Gln Pro Pro Thr Ser Ser Ile Leu Glu Asp Thr Tyr
405 410 415

cgc tac atc gag tcc ccc gcc acc aag tgt gcc agc aac gtg att cct 1296
Arg Tyr Ile Glu Ser Pro Ala Thr Lys Cys Ala Ser Asn Val Ile Pro
420 425 430

gca aaa gaa gac cct tat gca ggg ttt aag ttc tgg aac atc gac ctg 1344
Ala Lys Glu Asp Pro Tyr Ala Gly Phe Lys Phe Trp Asn Ile Asp Leu
435 440 445

aag gag aag ctg tct ctg gac ctg gat cag ttc ccc ttg ggc aga aga 1392
Lys Glu Lys Leu Ser Leu Asp Leu Asp Gln Phe Pro Leu Gly Arg Arg
450 455 460

ttt ctg gcc cag cag ggg gcc ggc tgt tcc acc gtg aga aaa cgc agg 1440
Phe Leu Ala Gln Gln Gly Ala Gly Cys Ser Thr Val Arg Lys Arg Arg
465 470 475 480

atc agc cag aag acc tcc agc aag ccc gcc aag aag aag aaa aag taa 1488
Ile Ser Gln Lys Thr Ser Ser Lys Pro Ala Lys Lys Lys Lys Lys
485 490 495



<210> 4
<211> 495
<212> PRT

<213> Artificial Sequence

<400> 4

Met Ala Leu Trp Gln Gln Gly Gln Lys Leu Tyr Leu Pro Pro Thr Pro
1 5 10 15

Val Ser Lys Val Leu Cys Ser Glu Thr Tyr Val Gln Arg Lys Ser Ile
20 25 30

Phe Tyr His Ala Glu Thr Glu Arg Leu Leu Thr Ile Gly His Pro Tyr
35 40 45

Tyr Pro Val Ser Ile Gly Ala Lys Thr Val Pro Lys Val Ser Ala Asn
50 55 60

Gln Tyr Arg Val Phe Lys Ile Gln Leu Pro Asp Pro Asn Gln Phe Ala
65 70 75 80

Leu Pro Asp Arg Thr Val His Asn Pro Ser Lys Glu Arg Leu Val Trp
85 90 95

Pro Val Ile Gly Val Gln Val Ser Arg Gly Gln Pro Leu Gly Gly Thr
100 105 110

Val Thr Gly His Pro Thr Phe Asn Ala Leu Leu Asp Ala Glu Asn Val
115 120 125

Asn Arg Lys Val Thr Thr Gln Thr Thr Asp Asp Arg Lys Gln Thr Gly
130 135 140

Leu Asp Ala Lys Gln Gln Gln Ile Leu Leu Leu Gly Cys Thr Pro Ala
145 150 155 160

Glu Gly Glu Tyr Trp Thr Thr Ala Arg Pro Cys Val Thr Asp Arg Leu
165 170 175

Glu Asn Gly Ala Cys Pro Pro Leu Glu Leu Lys Asn Lys His Ile Glu
180 185 190

Asp Gly Asp Met Met Glu Ile Gly Phe Gly Ala Ala Asn Phe Lys Glu
195 200 205

Ile Asn Ala Ser Lys Ser Asp Leu Pro Leu Asp Ile Gln Asn Glu Ile
210 215 220






Cys Leu Tyr Pro Asp Tyr Leu Lys Met Ala Glu Asp Ala Ala Gly Asn
225 230 235 240

Ser Met Phe Phe Phe Ala Arg Lys Glu Gln Val Tyr Val Arg His Ile
245 250 255

Trp Thr Arg Gly Gly Ser Glu Lys Glu Ala Pro Thr Thr Asp Phe Tyr
260 265 270

Leu Lys Asn Asn Lys Gly Asp Ala Thr Leu Lys Ile Pro Ser Val His
275 280 285

Phe Gly Ser Pro Ser Gly Ser Leu Val Ser Thr Asp Asn Gln Ile Phe
290 295 300

Asn Arg Pro Tyr Trp Leu Phe Arg Ala Gln Gly Met Asn Asn Gly Ile
305 310 315 320

Ala Trp Asn Asn Leu Leu Phe Leu Thr Val Gly Asp Asn Thr Arg Gly
325 330 335

Thr Asn Leu Thr Ile Ser Val Ala Ser Asp Gly Thr Pro Leu Thr Glu
340 345 350

Tyr Asp Ser Ser Lys Phe Asn Val Tyr His Arg His Met Glu Glu Tyr
355 360 365

Lys Leu Ala Phe Ile Leu Glu Leu Cys Ser Val Glu Ile Thr Ala Gln
370 375 380

Thr Val Ser His Leu Gln Gly Leu Met Pro Ser Val Leu Glu Asn Trp
385 390 395 400

Glu Ile Gly Val Gln Pro Pro Thr Ser Ser Ile Leu Glu Asp Thr Tyr
405 410 415

Arg Tyr Ile Glu Ser Pro Ala Thr Lys Cys Ala Ser Asn Val Ile Pro
420 425 430

Ala Lys Glu Asp Pro Tyr Ala Gly Phe Lys Phe Trp Asn Ile Asp Leu
435 440 445

Lys Glu Lys Leu Ser Leu Asp Leu Asp Gln Phe Pro Leu Gly Arg Arg
450 455 460

Phe Leu Ala Gln Gln Gly Ala Gly Cys Ser Thr Val Arg Lys Arg Arg
465 470 475 480

Ile Ser Gln Lys Thr Ser Ser Lys Pro Ala Lys Lys Lys Lys Lys
485 490 495
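
The <211> lengths stated for the coding sequences and their translations can be cross-checked with simple arithmetic: a CDS of n nucleotides that ends in a stop codon encodes n/3 - 1 residues. The following is a minimal sketch of that check; the function name `expected_protein_length` is illustrative and not part of the listing.

```python
def expected_protein_length(cds_nt: int) -> int:
    """Residues encoded by a CDS of cds_nt nucleotides that ends in a stop codon."""
    if cds_nt % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    # The final codon is the stop codon and contributes no residue.
    return cds_nt // 3 - 1


# <211> values as stated in the listing: CDS length -> protein length.
STATED_LENGTHS = {1488: 495, 1410: 469, 717: 238}

for cds_nt, protein_aa in STATED_LENGTHS.items():
    assert expected_protein_length(cds_nt) == protein_aa
```

The three pairs correspond to the L1, L2 and gfp entries respectively; a mismatch would point to a transcription error in either the DNA or the PRT record.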



<210> 5
<211> 1410
<212> DNA

<213> Bovine papillomavirus type 1

<220>

<221> CDS

<222> (1)..(1410)

<220>

<223> L2 open reading frame (wild-type)

<400> 5

atg agt gca cga aaa aga gta aaa cgt gcc agc gcc tat gac ctg tac 48
Met Ser Ala Arg Lys Arg Val Lys Arg Ala Ser Ala Tyr Asp Leu Tyr
1 5 10 15

agg acc tgc aag caa gcg ggc aca tgt cca cca gat gtg ata cga aag 96
Arg Thr Cys Lys Gln Ala Gly Thr Cys Pro Pro Asp Val Ile Arg Lys
20 25 30

gta gaa gga gat act ata gca gat aaa att ttg aaa ttt ggg ggt ctt 144
Val Glu Gly Asp Thr Ile Ala Asp Lys Ile Leu Lys Phe Gly Gly Leu
35 40 45

gca atc tac tta gga ggg cta gga ata gga aca tgg tct act gga agg 192
Ala Ile Tyr Leu Gly Gly Leu Gly Ile Gly Thr Trp Ser Thr Gly Arg
50 55 60

gtg gcc gca ggt gga tca cca agg tac aca cca ctc cga aca gca ggg 240
Val Ala Ala Gly Gly Ser Pro Arg Tyr Thr Pro Leu Arg Thr Ala Gly
65 70 75 80

tcc aca tca tcg ctt gca tca ata gga tcc aga gct gta aca gca ggg 288
Ser Thr Ser Ser Leu Ala Ser Ile Gly Ser Arg Ala Val Thr Ala Gly
85 90 95

acc cgc ccc agt ata ggt gcg ggc att cct tta gac acc ctt gaa act 336
Thr Arg Pro Ser Ile Gly Ala Gly Ile Pro Leu Asp Thr Leu Glu Thr
100 105 110

ctt ggg gcc ttg cgt cca ggg gtg tat gag gac act gtg cta cca gag 384
Leu Gly Ala Leu Arg Pro Gly Val Tyr Glu Asp Thr Val Leu Pro Glu
115 120 125

gcc cct gca ata gtc act cct gat gct gtt cct gca gat tca ggg ctt 432
Ala Pro Ala Ile Val Thr Pro Asp Ala Val Pro Ala Asp Ser Gly Leu
130 135 140

gat gcc ctg tcc ata ggt aca gac tcg tcc acg gag acc ctc att act 480
Asp Ala Leu Ser Ile Gly Thr Asp Ser Ser Thr Glu Thr Leu Ile Thr
145 150 155 160

ctg cta gag cct gag ggt ccc gag gac ata gcg gtt ctt gag ctg caa 528
Leu Leu Glu Pro Glu Gly Pro Glu Asp Ile Ala Val Leu Glu Leu Gln
165 170 175

ccc ctg gac cgt cca act tgg caa gta agc aat gct gtt cat cag tcc 576
Pro Leu Asp Arg Pro Thr Trp Gln Val Ser Asn Ala Val His Gln Ser
180 185 190






tct gca tac cac gcc cct ctg cag ctg caa tcg tcc att gca gaa aca 624
Ser Ala Tyr His Ala Pro Leu Gln Leu Gln Ser Ser Ile Ala Glu Thr
195 200 205

tct ggt tta gaa aat att ttt gta gga ggc tcg ggt tta ggg gat aca 672
Ser Gly Leu Glu Asn Ile Phe Val Gly Gly Ser Gly Leu Gly Asp Thr
210 215 220

gga gga gaa aac att gaa ctg aca tac ttc ggg tcc cca cga aca agc 720
Gly Gly Glu Asn Ile Glu Leu Thr Tyr Phe Gly Ser Pro Arg Thr Ser
225 230 235 240

acg ccc cgc agt att gcc tct aaa tca cgt ggc att tta aac tgg ttc 768
Thr Pro Arg Ser Ile Ala Ser Lys Ser Arg Gly Ile Leu Asn Trp Phe
245 250 255

agt aaa cgg tac tac aca cag gtg ccc acg gaa gat cct gaa gtg ttt 816
Ser Lys Arg Tyr Tyr Thr Gln Val Pro Thr Glu Asp Pro Glu Val Phe
260 265 270

tca tcc caa aca ttt gca aac cca ctg tat gaa gca gaa cca gct gtg 864
Ser Ser Gln Thr Phe Ala Asn Pro Leu Tyr Glu Ala Glu Pro Ala Val
275 280 285

ctt aag gga cct agt gga cgt gtt gga ctc agt cag gtt tat aaa cct 912
Leu Lys Gly Pro Ser Gly Arg Val Gly Leu Ser Gln Val Tyr Lys Pro
290 295 300

gat aca ctt aca aca cgt agc ggg aca gag gtg gga cca cag cta cat 960
Asp Thr Leu Thr Thr Arg Ser Gly Thr Glu Val Gly Pro Gln Leu His
305 310 315 320

gtc agg tac tca ttg agt act ata cat gaa gat gta gaa gca atc ccc 1008
Val Arg Tyr Ser Leu Ser Thr Ile His Glu Asp Val Glu Ala Ile Pro
325 330 335

tac aca gtt gat gaa aat aca cag gga ctt gca ttc gta ccc ttg cat 1056
Tyr Thr Val Asp Glu Asn Thr Gln Gly Leu Ala Phe Val Pro Leu His
340 345 350

gaa gag caa gca ggt ttt gag gag ata gaa tta gat gat ttt agt gag 1104
Glu Glu Gln Ala Gly Phe Glu Glu Ile Glu Leu Asp Asp Phe Ser Glu
355 360 365

aca cat aga ctg cta cct cag aac acc tct tct aca cct gtt ggt agt 1152
Thr His Arg Leu Leu Pro Gln Asn Thr Ser Ser Thr Pro Val Gly Ser
370 375 380

ggt gta cga aga agc ctc att cca act cga gaa ttt agt gca aca cgg 1200
Gly Val Arg Arg Ser Leu Ile Pro Thr Arg Glu Phe Ser Ala Thr Arg
385 390 395 400

cct aca ggt gtt gta acc tat ggc tca cct gac act tac tct gct agc 1248
Pro Thr Gly Val Val Thr Tyr Gly Ser Pro Asp Thr Tyr Ser Ala Ser
405 410 415

cca gtt act gac cct gat tct acc tct cct agt cta gtt atc gat gac 1296
Pro Val Thr Asp Pro Asp Ser Thr Ser Pro Ser Leu Val Ile Asp Asp
420 425 430

act act act aca cca atc att ata att gat ggg cac aca gtt gat ttg 1344
Thr Thr Thr Thr Pro Ile Ile Ile Ile Asp Gly His Thr Val Asp Leu
435 440 445

tac agc agt aac tac acc ttg cat ccc tcc ttg ttg agg aaa cga aaa 1392
Tyr Ser Ser Asn Tyr Thr Leu His Pro Ser Leu Leu Arg Lys Arg Lys
450 455 460

aaa cgg aaa cat gcc taa 1410
Lys Arg Lys His Ala
465 470



<210> 6 
<211> 469 
<212> PRT 

<213> Bovine papillomavirus type 1 
<400> 6 

Met Ser Ala Arg Lys Arg Val Lys Arg Ala Ser Ala Tyr Asp Leu Tyr
1 5 10 15

Arg Thr Cys Lys Gln Ala Gly Thr Cys Pro Pro Asp Val Ile Arg Lys
20 25 30

Val Glu Gly Asp Thr Ile Ala Asp Lys Ile Leu Lys Phe Gly Gly Leu
35 40 45

Ala Ile Tyr Leu Gly Gly Leu Gly Ile Gly Thr Trp Ser Thr Gly Arg
50 55 60

Val Ala Ala Gly Gly Ser Pro Arg Tyr Thr Pro Leu Arg Thr Ala Gly
65 70 75 80

Ser Thr Ser Ser Leu Ala Ser Ile Gly Ser Arg Ala Val Thr Ala Gly
85 90 95

Thr Arg Pro Ser Ile Gly Ala Gly Ile Pro Leu Asp Thr Leu Glu Thr
100 105 110

Leu Gly Ala Leu Arg Pro Gly Val Tyr Glu Asp Thr Val Leu Pro Glu
115 120 125

Ala Pro Ala Ile Val Thr Pro Asp Ala Val Pro Ala Asp Ser Gly Leu
130 135 140

Asp Ala Leu Ser Ile Gly Thr Asp Ser Ser Thr Glu Thr Leu Ile Thr
145 150 155 160

Leu Leu Glu Pro Glu Gly Pro Glu Asp Ile Ala Val Leu Glu Leu Gln
165 170 175

Pro Leu Asp Arg Pro Thr Trp Gln Val Ser Asn Ala Val His Gln Ser
180 185 190

Ser Ala Tyr His Ala Pro Leu Gln Leu Gln Ser Ser Ile Ala Glu Thr
195 200 205

Ser Gly Leu Glu Asn Ile Phe Val Gly Gly Ser Gly Leu Gly Asp Thr
210 215 220

Gly Gly Glu Asn Ile Glu Leu Thr Tyr Phe Gly Ser Pro Arg Thr Ser
225 230 235 240

Thr Pro Arg Ser Ile Ala Ser Lys Ser Arg Gly Ile Leu Asn Trp Phe
245 250 255

Ser Lys Arg Tyr Tyr Thr Gln Val Pro Thr Glu Asp Pro Glu Val Phe
260 265 270

Ser Ser Gln Thr Phe Ala Asn Pro Leu Tyr Glu Ala Glu Pro Ala Val
275 280 285

Leu Lys Gly Pro Ser Gly Arg Val Gly Leu Ser Gln Val Tyr Lys Pro
290 295 300

Asp Thr Leu Thr Thr Arg Ser Gly Thr Glu Val Gly Pro Gln Leu His
305 310 315 320

Val Arg Tyr Ser Leu Ser Thr Ile His Glu Asp Val Glu Ala Ile Pro
325 330 335

Tyr Thr Val Asp Glu Asn Thr Gln Gly Leu Ala Phe Val Pro Leu His
340 345 350

Glu Glu Gln Ala Gly Phe Glu Glu Ile Glu Leu Asp Asp Phe Ser Glu
355 360 365

Thr His Arg Leu Leu Pro Gln Asn Thr Ser Ser Thr Pro Val Gly Ser
370 375 380

Gly Val Arg Arg Ser Leu Ile Pro Thr Arg Glu Phe Ser Ala Thr Arg
385 390 395 400

Pro Thr Gly Val Val Thr Tyr Gly Ser Pro Asp Thr Tyr Ser Ala Ser
405 410 415

Pro Val Thr Asp Pro Asp Ser Thr Ser Pro Ser Leu Val Ile Asp Asp
420 425 430

Thr Thr Thr Thr Pro Ile Ile Ile Ile Asp Gly His Thr Val Asp Leu
435 440 445

Tyr Ser Ser Asn Tyr Thr Leu His Pro Ser Leu Leu Arg Lys Arg Lys
450 455 460

Lys Arg Lys His Ala
465



<210> 7
<211> 1410
<212> DNA

<213> Artificial Sequence

<220>

<221> CDS

<222> (1)..(1410)

<220>

<223> Description of Artificial Sequence: Bovine
papillomavirus type 1 L2 open reading frame
(humanized)

<220>

<223> Wild-type codons replaced with synonymous codons
used at relatively high frequency by human genes

<400> 7

atg agc gcc cgc aag aga gtg aag cgc gcc agc gcc tac gac ctg tac 48
Met Ser Ala Arg Lys Arg Val Lys Arg Ala Ser Ala Tyr Asp Leu Tyr
1 5 10 15

agg acc tgc aag cag gcc ggc aca tgt cca cca gat gtg atc cga aag 96
Arg Thr Cys Lys Gln Ala Gly Thr Cys Pro Pro Asp Val Ile Arg Lys
20 25 30

gtg gag ggc gac acc atc gcc gac aag atc ctg aag ttc ggc ggc ctg 144
Val Glu Gly Asp Thr Ile Ala Asp Lys Ile Leu Lys Phe Gly Gly Leu
35 40 45

gcc atc tac ctg ggc ggc ctg ggc atc gga aca tgg tct acc ggc agg 192
Ala Ile Tyr Leu Gly Gly Leu Gly Ile Gly Thr Trp Ser Thr Gly Arg
50 55 60

gtg gcc gcc ggc ggc tca cca agg tac acc cca ctg cgc acc gcc ggc 240
Val Ala Ala Gly Gly Ser Pro Arg Tyr Thr Pro Leu Arg Thr Ala Gly
65 70 75 80

tcc acc tcc tcc ctg gcc tcc atc gga tcc aga gcc gtg acc gcc ggg 288
Ser Thr Ser Ser Leu Ala Ser Ile Gly Ser Arg Ala Val Thr Ala Gly
85 90 95

acc cgc ccc tcc atc ggc gcg ggc atc cct ctg gac acc ctg gaa act 336
Thr Arg Pro Ser Ile Gly Ala Gly Ile Pro Leu Asp Thr Leu Glu Thr
100 105 110

ctt ggg gcc ctg cgc cct ggc gtg tac gag gac acc gtg ctg ccc gaa 384
Leu Gly Ala Leu Arg Pro Gly Val Tyr Glu Asp Thr Val Leu Pro Glu
115 120 125

gcc cct gcc atc gtg acc cct gac gcc gtg cct gca gac tcc ggc ctg 432
Ala Pro Ala Ile Val Thr Pro Asp Ala Val Pro Ala Asp Ser Gly Leu
130 135 140

gac gcc ctg tcc atc ggc aca gac tcc tcc acc gag acc ctg atc acc 480
Asp Ala Leu Ser Ile Gly Thr Asp Ser Ser Thr Glu Thr Leu Ile Thr
145 150 155 160

ctg ctg gag cct gag ggc ccc gaa gac ata gcc gtg ctg gaa ctc cag 528
Leu Leu Glu Pro Glu Gly Pro Glu Asp Ile Ala Val Leu Glu Leu Gln
165 170 175

ccc ctg gac cgc cca acc tgg cag gtg agc aat gct gtg cac cag tcc 576
Pro Leu Asp Arg Pro Thr Trp Gln Val Ser Asn Ala Val His Gln Ser
180 185 190

tct gcc tac cac gcc cct ctc cag ctg caa tcc tcc atc gcc gag aca 624
Ser Ala Tyr His Ala Pro Leu Gln Leu Gln Ser Ser Ile Ala Glu Thr
195 200 205

tct ggt tta gaa aat att ttt gta gga ggc tcg ggt tta ggg gat acc 672
Ser Gly Leu Glu Asn Ile Phe Val Gly Gly Ser Gly Leu Gly Asp Thr
210 215 220

ggc ggc gag aac atc gag ctg acc tac ttc ggc tcc ccc cgc acc agc 720
Gly Gly Glu Asn Ile Glu Leu Thr Tyr Phe Gly Ser Pro Arg Thr Ser
225 230 235 240

acc ccc cgc tcc atc gcc tcc aag tcc cgc ggc atc ctg aac tgg ttc 768
Thr Pro Arg Ser Ile Ala Ser Lys Ser Arg Gly Ile Leu Asn Trp Phe
245 250 255

agc aag cgg tac tac acc cag gtg ccc acc gaa gat ccc gaa gtg ttc 816
Ser Lys Arg Tyr Tyr Thr Gln Val Pro Thr Glu Asp Pro Glu Val Phe
260 265 270

tcc tcc cag acc ttc gcc aac ccc ctg tac gag gcc gag ccc gcc gtg 864
Ser Ser Gln Thr Phe Ala Asn Pro Leu Tyr Glu Ala Glu Pro Ala Val
275 280 285

ctg aag ggc cct agc ggc cgc gtg ggc ctg tcc cag gtg tac aag cct 912
Leu Lys Gly Pro Ser Gly Arg Val Gly Leu Ser Gln Val Tyr Lys Pro
290 295 300

gat acc ctg acc aca cgt agc ggc aca gag gtg ggc ccc cag ctg cat 960
Asp Thr Leu Thr Thr Arg Ser Gly Thr Glu Val Gly Pro Gln Leu His
305 310 315 320

gtg agg tac tcc ctg tcc acc atc cat gag gat gtg gag gct atc ccc 1008
Val Arg Tyr Ser Leu Ser Thr Ile His Glu Asp Val Glu Ala Ile Pro
325 330 335

tac acc gtg gat gag aac acc cag ggc ctg gcc ttc gtg ccc ctg cat 1056
Tyr Thr Val Asp Glu Asn Thr Gln Gly Leu Ala Phe Val Pro Leu His
340 345 350

gag gag cag gcc ggc ttc gag gag atc gag ctc gac gat ttc agc gag 1104
Glu Glu Gln Ala Gly Phe Glu Glu Ile Glu Leu Asp Asp Phe Ser Glu
355 360 365

acc cat cgc ctg ctg ccc cag aac acc tcc tcc acc ccc gtg ggc agc 1152
Thr His Arg Leu Leu Pro Gln Asn Thr Ser Ser Thr Pro Val Gly Ser
370 375 380

ggc gtg cgc aga agc ctg atc cct acc cga gag ttc agc gcc acc cgg 1200
Gly Val Arg Arg Ser Leu Ile Pro Thr Arg Glu Phe Ser Ala Thr Arg
385 390 395 400

cct acc ggc gtg gtg acc tac ggc tcc ccc gac acc tac tcc gct agc 1248
Pro Thr Gly Val Val Thr Tyr Gly Ser Pro Asp Thr Tyr Ser Ala Ser
405 410 415

ccc gtg acc gac cct gat tct acc tct cct agc ctg gtg atc gac gac 1296
Pro Val Thr Asp Pro Asp Ser Thr Ser Pro Ser Leu Val Ile Asp Asp
420 425 430

acc acc acc acc ccc atc atc atc atc gac ggc cac aca gtg gat ctg 1344
Thr Thr Thr Thr Pro Ile Ile Ile Ile Asp Gly His Thr Val Asp Leu
435 440 445

tac agc agc aac tac acc ctg cat ccc tcc ctg ctg agg aag cgc aag 1392
Tyr Ser Ser Asn Tyr Thr Leu His Pro Ser Leu Leu Arg Lys Arg Lys
450 455 460

aag cgc aag cat gcc taa 1410
Lys Arg Lys His Ala
465 470
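
The <223> note for SEQ ID NO: 7 states that wild-type codons were replaced with synonymous codons, which implies that the wild-type (SEQ ID NO: 5) and humanized (SEQ ID NO: 7) reading frames translate to the same protein. The following is a minimal sketch of that check on the first 16 codons of each sequence, assuming the standard genetic code. The helper names (`codons`, `translate`) are illustrative, and the two strings are transcribed from the listing with OCR letter confusions (c read as e) repaired.

```python
from itertools import product

# Standard genetic code; codons enumerated with bases in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq: str) -> list[str]:
    """Split a nucleotide string into codons, ignoring whitespace."""
    s = "".join(seq.lower().split())
    if len(s) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [s[i:i + 3] for i in range(0, len(s), 3)]


def translate(seq: str) -> str:
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


# First 16 codons of the BPV-1 L2 open reading frame (illustrative excerpt).
WILD_TYPE = "atg agt gca cga aaa aga gta aaa cgt gcc agc gcc tat gac ctg tac"
HUMANIZED = "atg agc gcc cgc aag aga gtg aag cgc gcc agc gcc tac gac ctg tac"

# Synonymous replacement: the protein is unchanged although codons differ.
assert translate(WILD_TYPE) == translate(HUMANIZED)
changed = sum(w != h for w, h in zip(codons(WILD_TYPE), codons(HUMANIZED)))
print(translate(HUMANIZED), changed)
```

Running the same comparison over the full-length sequences would flag any position where a transcription error breaks synonymy.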



<210> 8 

<211> 469 

<212> PRT 

<213> Artificial Sequence 



<400> 8 

Met Ser Ala Arg Lys Arg Val Lys Arg Ala Ser Ala Tyr Asp Leu Tyr
1 5 10 15

Arg Thr Cys Lys Gln Ala Gly Thr Cys Pro Pro Asp Val Ile Arg Lys
20 25 30

Val Glu Gly Asp Thr Ile Ala Asp Lys Ile Leu Lys Phe Gly Gly Leu
35 40 45

Ala Ile Tyr Leu Gly Gly Leu Gly Ile Gly Thr Trp Ser Thr Gly Arg
50 55 60

Val Ala Ala Gly Gly Ser Pro Arg Tyr Thr Pro Leu Arg Thr Ala Gly
65 70 75 80

Ser Thr Ser Ser Leu Ala Ser Ile Gly Ser Arg Ala Val Thr Ala Gly
85 90 95

Thr Arg Pro Ser Ile Gly Ala Gly Ile Pro Leu Asp Thr Leu Glu Thr
100 105 110

Leu Gly Ala Leu Arg Pro Gly Val Tyr Glu Asp Thr Val Leu Pro Glu
115 120 125

Ala Pro Ala Ile Val Thr Pro Asp Ala Val Pro Ala Asp Ser Gly Leu
130 135 140

Asp Ala Leu Ser Ile Gly Thr Asp Ser Ser Thr Glu Thr Leu Ile Thr
145 150 155 160

Leu Leu Glu Pro Glu Gly Pro Glu Asp Ile Ala Val Leu Glu Leu Gln
165 170 175

Pro Leu Asp Arg Pro Thr Trp Gln Val Ser Asn Ala Val His Gln Ser
180 185 190

Ser Ala Tyr His Ala Pro Leu Gln Leu Gln Ser Ser Ile Ala Glu Thr
195 200 205

Ser Gly Leu Glu Asn Ile Phe Val Gly Gly Ser Gly Leu Gly Asp Thr
210 215 220

Gly Gly Glu Asn Ile Glu Leu Thr Tyr Phe Gly Ser Pro Arg Thr Ser
225 230 235 240

Thr Pro Arg Ser Ile Ala Ser Lys Ser Arg Gly Ile Leu Asn Trp Phe
245 250 255

Ser Lys Arg Tyr Tyr Thr Gln Val Pro Thr Glu Asp Pro Glu Val Phe
260 265 270

Ser Ser Gln Thr Phe Ala Asn Pro Leu Tyr Glu Ala Glu Pro Ala Val
275 280 285

Leu Lys Gly Pro Ser Gly Arg Val Gly Leu Ser Gln Val Tyr Lys Pro
290 295 300

Asp Thr Leu Thr Thr Arg Ser Gly Thr Glu Val Gly Pro Gln Leu His
305 310 315 320

Val Arg Tyr Ser Leu Ser Thr Ile His Glu Asp Val Glu Ala Ile Pro
325 330 335

Tyr Thr Val Asp Glu Asn Thr Gln Gly Leu Ala Phe Val Pro Leu His
340 345 350

Glu Glu Gln Ala Gly Phe Glu Glu Ile Glu Leu Asp Asp Phe Ser Glu
355 360 365

Thr His Arg Leu Leu Pro Gln Asn Thr Ser Ser Thr Pro Val Gly Ser
370 375 380

Gly Val Arg Arg Ser Leu Ile Pro Thr Arg Glu Phe Ser Ala Thr Arg
385 390 395 400

Pro Thr Gly Val Val Thr Tyr Gly Ser Pro Asp Thr Tyr Ser Ala Ser
405 410 415

Pro Val Thr Asp Pro Asp Ser Thr Ser Pro Ser Leu Val Ile Asp Asp
420 425 430

Thr Thr Thr Thr Pro Ile Ile Ile Ile Asp Gly His Thr Val Asp Leu
435 440 445

Tyr Ser Ser Asn Tyr Thr Leu His Pro Ser Leu Leu Arg Lys Arg Lys
450 455 460

Lys Arg Lys His Ala
465



<210> 9
<211> 717
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Aequorea
victoria gfp gene (humanized)

<220>

<221> CDS

<222> (1)..(717)

<400> 9

atg agc aag ggc gag gaa ctg ttc act ggc gtg gtc cca att ctc gtg 48
Met Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1 5 10 15

gaa ctg gat ggc gat gtg aat ggg cac aaa ttt tct gtc agc gga gag 96
Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu
20 25 30

ggt gaa ggt gat gcc aca tac gga aag ctc acc ctg aaa ttc atc tgc 144
Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45

acc act gga aag ctc cct gtg cca tgg cca aca ctg gtc act acc ttc 192
Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe
50 55 60

tct tat ggc gtg cag tgc ttt tcc aga tac cca gac cat atg aag cag 240
Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln
65 70 75 80

cat gac ttt ttc aag agc gcc atg ccc gag ggc tat gtg cag gag aga 288
His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90 95

acc atc ttt ttc aaa gat gac ggg aac tac aag acc cgc gct gaa gtc 336
Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val
100 105 110

aag ttc gaa ggt gac acc ctg gtg aat aga atc gag ctg aag ggc att 384
Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125

gac ttt aag gag gat gga aac att ctc ggc cac aag ctg gaa tac aac 432
Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140

tat aac tcc cac aat gtg tac atc atg gcc gac aag caa aag aat ggc 480
Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly
145 150 155 160

atc aag gtc aac ttc aag atc aga cac aac att gag gat gga tcc gtg 528
Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val
165 170 175

cag ctg gcc gac cat tat caa cag aac act cca atc ggc gac ggc cct 576
Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
180 185 190

gtg ctc ctc cca gac aac cat tac ctg tcc acc cag tct gcc ctg tct 624
Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser
195 200 205

aaa gat ccc aac gaa aag aga gac cac atg gtc ctg ctg gag ttt gtg 672
Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220

acc gct gct ggg atc aca cat ggc atg gac gag ctg tac aag tga 717
Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235



<210> 10
<211> 238
<212> PRT

<213> Artificial Sequence

<400> 10

Met Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1 5 10 15

Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu
20 25 30

Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45

Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe
50 55 60

Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln
65 70 75 80

His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90 95

Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val
100 105 110

Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125

Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140

Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly
145 150 155 160

Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val
165 170 175

Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
180 185 190

Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser
195 200 205

Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220

Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235



<210> 11
<211> 717
<212> DNA

<213> Artificial Sequence

<220>

<221> CDS

<222> (1)..(717)

<220>
<223> Description of Artificial Sequence: Synthetic gfp
gene (Papillomavirusized)

<220>
<223> Codons of humanized gfp gene replaced with
synonymous codons used at relatively high
frequency by papillomavirus genes

<400> 11

atg ag^ aaa ggg gaa gaa cta ttt aca ggg gtg gtg cct ata cta gtg 48
Met Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1 5 10 15

gaa cta gat ggg gat gtg aat ggg cac aaa ttt tct gtc agt ggg gaa 96
Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu
20 25 30

ggg gaa ggg gat gca aca tat ggg aaa cta aca cta aaa ttt ata tgc 144
Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45

aca aca ggg aaa cta cct gtg cca tgg cct aca cta gtg aca aca ttt 192
Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe
50 55 60

agt tat ggg gtg caa tgc ttt agt aga tat cct gat cat atg aaa caa 240
Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln
65 70 75 80

cat gat ttt ttt aaa agt gca atg ccc gag ggg tat gtg caa gaa aga 288
His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90 95

aca ata ttt ttt aaa gat gat ggg aat tat aaa aca aga gca gaa gtc 336
Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val
100 105 110

aaa ttt gaa ggg gat aca cta gtg aat aga ata gag ctc aaa ggg ata 384
Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125

gat ttt aaa gaa gat ggg aat ata cta ggg cat aaa cta gaa tat aat 432
Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140

tat aat agt cat aat gtg tat ata atg gca gat aaa caa aaa aat ggg 480
Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly
145 150 155 160

ata aaa gtg aat ttt aaa ata ata aga cat ata gaa gat gga tcc gtg 528
Ile Lys Val Asn Phe Lys Ile Ile Arg His Ile Glu Asp Gly Ser Val
165 170 175

caa cta gca gat cat tat caa caa aat aca cct ata ggg gat ggg cct 576
Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
180 185 190

gtg cta cta cct gat aac cat tat cta agt aca caa agt gca cta agt 624
Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser
195 200 205

aaa gat cct aat gaa aaa aga gat cat atg gtg cta ctc gag ttt gtg 672
Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220

aca gca gca ggg ata aca cat ggg atg gat gaa cta tat aaa tga 717
Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235
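
The <223> note for SEQ ID NO: 11 says that codons of the humanized gfp gene were replaced with synonymous codons favoured by papillomavirus genes. One way to see what was changed is to list the codon positions at which the two sequences differ. The following is a minimal sketch for codons 17 to 32 only; the function name `codon_differences` is illustrative, and the two excerpts are transcribed from SEQ ID NO: 9 and SEQ ID NO: 11 with OCR letter confusions repaired.

```python
def codon_differences(reference: str, variant: str, start: int = 1):
    """Return (position, reference_codon, variant_codon) for each differing codon.

    Both inputs are whitespace-separated codons; `start` is the codon
    number of the first codon in the excerpt.
    """
    ref = reference.lower().split()
    var = variant.lower().split()
    if len(ref) != len(var):
        raise ValueError("sequences must contain the same number of codons")
    return [(start + i, r, v) for i, (r, v) in enumerate(zip(ref, var)) if r != v]


# Codons 17-32 of the humanized and papillomavirusized gfp genes.
HUMANIZED_GFP = "gaa ctg gat ggc gat gtg aat ggg cac aaa ttt tct gtc agc gga gag"
PV_GFP = "gaa cta gat ggg gat gtg aat ggg cac aaa ttt tct gtc agt ggg gaa"

diffs = codon_differences(HUMANIZED_GFP, PV_GFP, start=17)
for position, humanized, pv in diffs:
    print(f"codon {position}: {humanized} -> {pv}")
```

In this excerpt the substitutions touch only the third codon position, as expected for synonymous replacements within the same codon family.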



<210> 12
<211> 238
<212> PRT

<213> Artificial Sequence

<400> 12

Met Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1 5 10 15

Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu
20 25 30

Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45

Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe
50 55 60

Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln
65 70 75 80

His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90 95

Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val
100 105 110

Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125

Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140

Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly
145 150 155 160

Ile Lys Val Asn Phe Lys Ile Ile Arg His Ile Glu Asp Gly Ser Val
165 170 175

Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
180 185 190

Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser
195 200 205

Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220

Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235



<210> 13 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Ala(GCA) 

<400> 13 

taaggactgt aagactt 
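
Each record declares its length in the <211> field, so a transcribed oligonucleotide can be checked against that declaration. The following is a minimal sketch of such a check for SEQ ID NO: 13; the helper names (`declared_length`, `residues`) are illustrative, and the record text is a trimmed copy of the entry above.

```python
import re

# Trimmed copy of the SEQ ID NO: 13 record from the listing.
RECORD = """<210> 13
<211> 17
<212> DNA
<400> 13
taaggactgt aagactt
"""


def declared_length(record: str) -> int:
    """Read the length declared in the <211> field."""
    match = re.search(r"<211>\s*(\d+)", record)
    if match is None:
        raise ValueError("record has no <211> field")
    return int(match.group(1))


def residues(record: str) -> str:
    """Return the sequence after the <400> line with spacing and digits removed."""
    parts = re.split(r"<400>\s*\d+", record, maxsplit=1)
    if len(parts) != 2:
        raise ValueError("record has no <400> field")
    return re.sub(r"[^a-z]", "", parts[1].lower())


sequence = residues(RECORD)
assert len(sequence) == declared_length(RECORD)
print(sequence, len(sequence))
```

The same two helpers apply unchanged to the other 17-mer records that follow.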






<210> 14 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Arg(CGA) 

<400> 14 

cgagccagcc aggagtc 



<210> 15 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Asn(AAC) 

<400> 15 

ctagattggc aggaatt 



<210> 16 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Asp(GAC) 

<400> 16 

taagatatat agattat 



<210> 17 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Cys (TGC) 

<400> 17 

aagtcttagt agagatt 



<210> 18
<211> 17
<212> DNA

<213> Artificial Sequence






<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Glu(GAA) 

<400> 18 

tatttctaca cagcatt 



<210> 19 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Gln(CAA) 

<400> 19 

ctaggacaat aggaatt 



<210> 20 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Gly(GGA) 

<400> 20 

tactctcttc tgggttt 



<210> 21 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for His (CAC) 

<400> 21 

tgccgtgact cggattc 



<210> 22 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Ile(ATC) 

<400> 22

tagaaataag agggctt 17



<210> 23 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Leu(CTA) 

<400> 23 

tacttttatt tggattt 



<210> 24 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Leu(CTT) 

<400> 24 

tattagggag aggattt 



<210> 25 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Lys (AAA) 

<400> 25 

tcactatgga gatttta 



<210> 26 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Lys (AAG) 

<400> 26 

cgcccaacgt ggggctc 



<210> 27
<211> 17
<212> DNA

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Met (elong) 

<400> 27 

tagtacggga aggattt 



<210> 28 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Phe (TTC) 

<400> 28 

tgtttatggg atacaat 



<210> 29 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Pro (CCA) 

<400> 29 

tcaagaagaa ggagcta 



<210> 30 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Pro (CCD 

<400> 30 

gggctcgtcc gggattt 



<210> 31
<211> 17
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:

Oligonucleotide specific for Ser (AGC)

<400> 31

ataagaaagg aagatcg 17



<210> 32 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Thr(ACA) 

<400> 32 

tgtcttgaga agagaag 



<210> 33 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Tyr(TAC) 

<400> 33 

tggtaaaaag aggattt 



<210> 34 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide specific for Val (GTA) 

<400> 34 

tcagagtgtt cattggt 
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